EDITORIAL


Extremely large telescopes at risk 


Images of the cosmos from the Hubble Space Telescope
and the James Webb Space Telescope have
awed the public and astronomers alike. Until the 
Hubble, breakthroughs in astronomy came from big 
telescopes on mountain-top observatories—discov- 
eries that include the expansion of the Universe and 
planets orbiting other stars. A new generation of 
extremely large ground-based telescopes is under devel- 
opment, which, when paired with space-based observa- 
tories, will produce even more remarkable discoveries 
about our Universe. 

In the 1990s, two technological innovations revolu- 
tionized ground-based astronomy—segmented mirrors 
and adaptive optics. These advances made it possible 
to increase the size of telescope 
mirrors well beyond 8 m and to undo 
the blurring effects of Earth’s atmo- 
sphere. The extremely large tele- 
scopes (ELTs) under construction 
today, with mirrors that are 25 to 
39 m in diameter and adaptive optics, 
will have 100 times the light-gath- 
ering power and 10 times the im- 
age quality of the Hubble Space 
Telescope. These giant telescopes 
will search for signatures of life on 
exoplanets, reveal new insights on 
the nature and origin of black holes, 
and investigate the deep myster- 
ies of dark matter and dark energy. 
These technological marvels are also 
extremely expensive and complex to build and manage. 

The European Southern Observatory (ESO)—an 
organization of 16 member nations—is on track to 
complete the 39-m European ELT in the Chilean Ata- 
cama Desert by decade’s end. The two US projects, 
the Giant Magellan Telescope (GMT) in Chile (with 
a 25-m-diameter mirror) and the Thirty Meter Tele- 
scope (TMT) planned for Mauna Kea in Hawaii, were 
initiated almost 25 years ago by private institutions 
backed by generous philanthropy. Because of their dis- 
covery power, the 2020 Decadal Survey of Astronomy 
and Astrophysics recommended as its highest prior- 
ity for ground-based astronomy that the United States 
invest in at least one, preferably two, ELTs. However, 
the cost of each has now risen to around 3 billion US 
dollars, and international partners have been added. 
Both the GMT and TMT have substantial fundraising 
challenges, and each is hoping to bring the US National
Science Foundation (NSF) on as a partner. It is
simply not possible for NSF to join both projects at
the level needed to make each successful. Instead, NSF 
should take the lead in planning, building, and oper- 
ating a single telescope. International partners are a 
must, and an alliance with ESO might be desirable. 
NSF knows how to manage such agreements, and in- 
ternational partners will be more comfortable with a 
government-to-government arrangement. 

NSF transitioned the Vera C. Rubin Observatory in 
Chile from private funding to public funding. A simi- 
lar shift for an ELT is possible but will be more chal- 
lenging because international partners are in place and 
substantial investments have been made. Although NSF 
has built and operated big facilities that have produced 
stunning discoveries, such as the 
Laser Interferometer Gravitational- 
Wave Observatory, an ELT is beyond 
even that scale, and so NSF will 
have to work with Congress and the 
White House Office of Management 
and Budget to make the necessary 
structural changes at the agency to 
design, build, and operate it. 

With its strong connections to 
the scientific community, NSF can 
create a fair process to select which 
project it will move forward with. 
Because astronomers were hoping 
for two US ELTs, NSF and leaders 
in astronomy must explain why a 
single telescope is the only viable 
solution. NSF must act soon and not leave GMT and 
TMT in their current state of limbo. Both projects have 
made big investments to make the dream of a US ELT 
a reality, and each urgently needs to plan for its future, 
with or without NSF. 

For 100 years, big telescopes led by private institutions 
and funded by philanthropy made the US a world leader 
in astronomy. The cost and complexity of these ELTs have 
broken that model. Philanthropy is still essential to US 
astronomy because its nimbleness allows for more risk 
taking and early investment, as occurred with these big 
telescopes. Hopefully, philanthropy will continue to do 
so in the future. Building these amazing windows to our 
vast Universe is difficult and expensive. But it is worth 
the effort. These giant telescopes will help us to better 
understand our place in the Universe and will again 
demonstrate what humankind can do when we work to- 
gether to pursue lofty and ambitious goals. 

—Michael S. Turner 
WHO scientist Natasha Crowcroft, in STAT, about measles deaths rising by 43% in 2022, to 136,000,
most of them children in low-income countries. The COVID-19 pandemic reduced inoculation rates. 


Edited by Jeffrey Brainard 


CRISPR has helped patients with sickle cell disease, which causes red blood cells to form a sickled shape. 


U.K. approves CRISPR therapy for sickle cell 


In a world first, U.K. regulators last week approved a therapy that
uses CRISPR, the Nobel Prize-winning gene-editing tool invented 
in 2012. The treatment has been shown to help people with beta 
thalassemia and sickle cell disease, both inherited blood disorders 
that involve defects in the oxygen-carrying protein hemoglobin. It 
relies on removing blood stem cells from patients, using CRISPR 
to turn on the gene for a fetal form of hemoglobin, then reinfusing the 
cells. The therapy, developed by the companies Vertex Pharmaceuticals 
and CRISPR Therapeutics, has been tested in clinical trials with dozens 
of patients. Almost all those with sickle cell stopped having debilitating 
pain “crises,” and most beta thalassemia patients were able to forgo the 
blood transfusions they previously needed. The treatment is expected 
to cost at least $2 million, raising questions about whether the United 
Kingdom’s National Health Service and U.S. insurance companies will 
cover it. U.S. regulators are expected to approve the therapy for sickle 
cell disease by 8 December and for beta thalassemia by 30 March 2024. 


862 24 NOVEMBER 2023 + VOL 382 ISSUE 6673 




House OKs gain-of-function ban 


BIOSAFETY | Although it may never become law, the U.S. House of
Representatives last week approved an amendment that would ban National
Institutes of Health (NIH) funding for studies that might make an actual
or possible human pathogen more dangerous. Sponsored by Representatives
Thomas Massie (R-KY) and Mariannette Miller-Meeks (R-IA), the amendment
to a 2024 House spending bill bars federal support for “any
gain-of-function research involving a potential pandemic pathogen.” The
measure reflects a long-running debate about studies that give risky
pathogens new abilities, to help infectious disease experts anticipate
and prepare for pandemics. Some Republicans allege that the COVID-19
pandemic resulted from NIH-funded gain-of-function research in China,
which the agency strongly denies. The American Society for Microbiology
says the vaguely worded measure could halt noncontroversial work on
annual flu vaccines, COVID-19 treatments and vaccines, and routine
studies of many common viruses. The Democratic-led Senate, which has
given no evidence it supports a similar ban, has not yet voted on its
version of the bill.




EU to extend glyphosate use 


HERBICIDES | The European Commission 
last week said it would extend its approval 
of the controversial herbicide glyphosate 
for 10 more years—“subject to certain new 
conditions and restrictions.” Among them 
are a ban on using the compound before 
harvest—rather than before sowing—to 
wither crops and ease harvesting, and 
measures to curb its spread to untargeted 
plants. Besides its environmental risks, 
glyphosate has been linked to possible 
human health concerns, such as cancer. 
The decision to renew the authorization 
fell to the Commission because EU coun- 
tries failed to reach a qualified majority 
in favor of, or against, the reauthoriza- 
tion. Under the Commission’s approval, 
individual member states will still be able 
to restrict the use of products containing 
glyphosate if they deem it necessary, for 
example to protect biodiversity. 
Big gift for high-risk research


PHILANTHROPY | The California Institute of Technology (Caltech)
announced last week that it has received a $400 million gift to award
fellowships for U.S. faculty members in physics and chemistry to conduct
research promising “transformational discoveries.” The donor, Ross
Brown, ran a company that provided equipment and services to the natural
gas and industrial gas industries. In 2020, he provided money to launch
the Brown Investigators Program. Caltech, which will now administer it,
will invite selected research universities, chosen anew each year, to
nominate faculty members who won tenure within the past 10 years. An
independent panel will select at least eight recipients annually for
awards of $2 million over 5 years. Caltech faculty members are
ineligible, but the university will receive about $1 million a year for
chemistry and physics research.


Weather aids Ukraine’s ship strike 


PHYSICAL SCIENCE | On 13 April 2022, the flagship of Russia’s Black Sea
fleet, the Moskva, was hit by a pair of Ukrainian missiles and sank
hours later. It was a mysterious blow, as the vessel was at least
120 kilometers from shore, a distance well outside the normal range of
the coastal battery’s radar. New modeling suggests the precision strike
was aided by weather conditions. That day, northerly cyclonic winds
swept warm, dry air over a cool, moist boundary layer on the Black Sea.
The resulting temperature inversion caused low-level clouds to form
between the radar and the warship. The thick clouds would have allowed
the radar’s pulses and pings to hug Earth’s curvature over a longer
distance than usual, revealing the Russian ship, concludes a study
posted online this month in the Bulletin of the American Meteorological
Society. The authors stress that militaries need to be aware of how such
atmospheric conditions can affect radar propagation.


A temperature inversion may have aided the radar
that enabled Ukraine to cripple a Russian warship.


SCIENCE science.org


FUNDING


NIH spends more on applied research than basic science


For the past decade, more of the U.S. National Institutes of Health's (NIH's)
extramural funding has gone to applied research than to basic, the agency's
extramural research director reported last month. In 2022, applied research got
56% of the total. Applied projects, including translational research and clinical
trials of new drugs, cost more on average than basic research, the director,
Michael Lauer, says in a blog post. (Using a different method, the agency's
budget office has charted a similar rise but finds basic research has held steady
at just over 50% in recent years.) In recent years, concerned NIH officials have
issued statements welcoming proposals for basic research studies. An agency
spokesperson noted that the gap in number of awards is closer; those for applied
work have exceeded those for basic research only slightly since 2019.


M.D. to head U.S. cancer institute 


LEADERSHIP | President Joe Biden has 
tapped Vanderbilt University Medical 
Center physician-scientist Kimryn Rathmell 
to direct the U.S. National Cancer Institute 
(NCI). Rathmell has helped lead large 
projects to sequence tumor genomes, and 
her work on the biology of kidney cancer 
has led to new ways to treat the disease. 
Biden called her a “talented and visionary 
leader” who “embodies the promise of the 
Biden Cancer Moonshot,” his effort to halve
the death rate from cancer by 2047. She will 
replace Monica Bertagnolli, who headed 
NCI for just over a year before the U.S. 
Senate confirmed her this month as director 
of its parent agency, the National Institutes 
of Health. 


Census advisers call for review 


DEMOGRAPHICS | A U.S. Census Bureau
advisory panel last week recommended 
delaying proposed changes to questions 
about disabilities on a key annual survey 
until researchers and advocacy groups 
have time to recommend alternatives. 
The bureau’s alterations to the American
Community Survey would lower the cur- 
rent estimate of disability prevalence in 
the United States from 14% to 8%. Many 
researchers and advocates have criticized 
the proposal for its effect on a key data 
source; state and federal programs also 
rely on the data when allocating funding 
and evaluating whether disabled people 
are being given equal opportunities. 
Supporters of the changes say they reflect 
better methods and would harmonize 
international comparisons. 


Bias-free clinical algorithms 


HEALTH DISPARITIES | A broad effort from 
medical researchers, educators, and 
funders is needed to systematically assess—and
ultimately reduce—unequal health effects
of racially biased medical algorithms, 
according to a white paper last week from 
the Council of Medical Specialty Societies, 
a coalition of more than 50 organizations. 
Although considering race when making
medical decisions can be beneficial in
some cases, including race in algorithms 
that guide treatment could perpetuate 
race-based disparities in health care and 
disadvantage historically marginalized 
populations, the paper says. 
Frontier, at Oak Ridge National
Laboratory, can make more than 1.1 billion 
billion computations per second. 


Exascale computers show off emerging science 


World's fastest supercomputers will sharpen climate forecasts and design new materials 


By Robert F. Service, in Denver 


To really understand how a material behaves,
researchers need to simulate its
whirling electrons, which govern most 
of its chemical and electronic proper- 
ties. But they have traditionally faced a 
trade-off. They could simulate up to a 
couple of hundred electrons with near-perfect 
accuracy. Or they could simulate a much 
larger number—while accuracy fell off a cliff. 

The world’s most powerful supercomputers,
operating at the far frontier of
speed known as the exascale, have now be- 
gun to eliminate that trade-off. 

At SC23, a supercomputing conference 
here, researchers last week reported simulat- 
ing the behavior of up to 600,000 electrons 
within a microscopic chunk of a magnesium 
alloy with nearly the accuracy of a quantum 
Monte Carlo simulation, the gold standard 
for much smaller numbers of electrons. “We 
broke through the accuracy-length scale barrier,”
says Sambit Das, a mechanical engineer
at the University of Michigan and a member 
of the team presenting the work. 

The simulations showed how defects 
form in the alloys, which could open the 
way to designing novel lightweight alloys
for fuel efficient cars and airplanes. Apply- 
ing the techniques to materials called quasi- 
crystals, ordered solids that lack repeating 
atomic arrangements, also revealed why 
they take the unusual shapes they do, an 
advance that could lead to novel magnetic 
materials and superconductors. 

In other feats of computation, research- 
ers at SC23 reported efforts to predict air 
flow and noise from a fuel efficient jet 
engine design, and how heat would pulse 
through the core of a small modular nu- 
clear reactor, an advance that could in- 
form safer designs. All are among the early 
results emerging from one of the world’s 
first exascale supercomputers, Frontier, at 
the Department of Energy’s (DOE’s) Oak 
Ridge National Laboratory. Capable of 
1.1 exaflops (10^18 flops), or 1.1 billion billion
operations per second, Frontier is more 
than twice as fast as the fastest machine 
from just 2 years ago. 

These results and others on the way from 
exascale machines coming online in the 
next few years promise to open a new win- 
dow into materials, climate science, biology, 
and medicine. “There is a new science era 
that is unfolding,” says Ceren Susut, associate
director for DOE’s Advanced Scientific
Computing Research program. 

A pair of Chinese supercomputers is 
widely credited as the first to cross the 
exascale threshold in 2022, and research 
results from those machines, such as the 
first global climate model to incorporate 
the cooling effects of specific volcanic erup- 
tions, were presented at the meeting. But 
Chinese officials have declined to share details
about their computers and don’t offer
access to scientists outside China through
an open review process. “China is opaque to 
us,” says Erich Strohmaier, who helps assemble
the biannual TOP500 list of the world’s
most powerful supercomputers. 

That’s left Frontier as the world’s only of- 
ficial exascale supercomputer. Completed 
in May 2022, Frontier opened for general 
scientific use in April. Weighing nearly 
270 tons, Frontier contains more than 40,000 
processors that make it about 1 million 
times more powerful than an average desk- 
top computer. It consumes 21 megawatts 
of power—enough for more than 15,000 
homes—and needs to be cooled with four 
350-horsepower pumps, powerful enough to 
fill an Olympic-size pool in 30 minutes, that
continuously circulate water. 

Frontier is expected to be eclipsed within 
weeks by Aurora, a second U.S. exascale be- 
hemoth now completing its final debugging 
phase at Argonne National Laboratory. Even 
though only partially installed, Aurora has al- 
ready weighed in as the world’s second most 
powerful computer and should soon top out 
at more than 2 exaflops. When it opens up 
for scientific proposals, it is expected to guide 
engineers in designing more fuel efficient air- 
planes, aid the quest for green energy cata- 
lysts, and propel efforts to predict patient 
responses to cancer treatments by simulating 
the spread of metastases through the blood- 
stream. El Capitan, a third U.S. exascale ma- 
chine being installed at Lawrence Livermore 
National Laboratory (LLNL), is expected to 
come online in the middle of 2024 and help 
nuclear weapons scientists simulate explo- 
sions from the aging U.S. stockpile. The three 
U.S. machines were supported by $4 billion
provided to DOE and the National 
Nuclear Security Administra- 
tion, with about half the 
money dedicated to build- 
ing the machines and half 
going to software devel- 
opment and personnel. 

Meanwhile, other 
countries are pressing 
ahead with their own 
exascale efforts. Jupiter, 
an exascale machine in 
Germany, is slated to come 
online at the end of 2024. 
An exascale upgrade to the 
Fugaku supercomputer in 
Japan is planned for 2029. 
And France is currently 
planning to build an exa- 
scale system called Jules Vernes, although a 
release date has yet to be announced. 

The new machines are the culmination 
of the seemingly relentless 1000-fold jumps 
in supercomputing speed and power that 
have occurred every decade or so since the 
early 1990s. But this last leap from petascale 
(10^15 flops) to exascale required some design
changes. With Frontier, researchers decided 
to incorporate vast numbers of graphical 
processing units (GPUs), the high-speed 
chips at the heart of gaming consoles, bit- 
coin mining, and artificial intelligence (AI). 
They also soldered 128 gigabytes of memory
onto each GPU chip to reduce the time and 
energy needed to shuffle data back and forth 
between processors and memory chips. 

The payoff was the ability to track events 
not only at ultrahigh resolution, but over 
broader spatial or timescales. “What exascale 
computers give us is an ability to get higher 
resolution for longer periods of time,” says 
Lori Diachin, principal deputy director of
computing at LLNL.


Researchers simulated 40,000
electrons in a quasi-crystal made of
2000 ytterbium and cadmium atoms.

At the meeting, Luca Bertagna, an applied 
mathematician at Sandia National Labora- 
tory, reported how Frontier enabled him and 
his colleagues to sharpen the resolution of 
DOE's global climate model from 100 kilo- 
meters to just 3 kilometers. That allowed the 
model to simulate the fine-scale atmospheric 
processes that give rise to clouds, which in 
coarser models have to be estimated. Because 
the behavior of clouds in a warming world 
represents one of the biggest uncertainties in 
climate change, the higher resolution should 
help researchers sharpen their predictions 
of how rising greenhouse gas concentrations 
will warm the planet, says John Taylor, a high 
performance computing expert at Australia’s 
Commonwealth Scientific and Industrial Re- 
search Organisation. 

Having conquered the exascale, researchers
are already eyeing the next leap in supercomputing:
zettascale (10^21 flops) machines.
But it’s going to be hard, says
Christine Chalk, program manager
for DOE’s Exascale Computing
Project (ECP). “All
the low-hanging fruit is
gone.” The biggest issue,
hardware experts say,
is that the decadeslong
trend of steady shrinkage
of transistors and
other computing devices,
known as Moore’s Law, has
slowed considerably in recent
years.

One popular idea for 
squeezing more processing 
power out of current de- 
signs is to relax the math- 
ematical precision with 
which current computer chips make calcula- 
tions, a change that could lead to a 10-fold 
improvement in processing power. But such 
a change could allow any errors to com- 
pound, undermining reliability. Other ideas 
include creating hybrid machines that would 
incorporate still emerging technologies such 
as quantum computers and systems tailored 
for machine learning and AI. “I don’t know 
what’s coming next, but I hope it will give us 
another boost,” says Jack Dongarra, a high
performance computing expert at the Uni- 
versity of Tennessee, Knoxville. 

But all that would take a new influx of 
money, which doesn’t appear to be in the off- 
ing. With its work largely complete, DOE’s 
ECP is due to sunset next month, which could 
leave hundreds of computer scientists and 
engineers out of a job. “The fear is that talent 
will leave the DOE and go to companies like 
NVIDIA, Microsoft, and Facebook,” Dongarra
says. “It’s a very hard thing to replace.”
Nocturnal
habits may 
help animals 
survive crises 


After ancient extinctions, 
survivors switched to day 
shifts, study of fish suggests 


By Elizabeth Pennisi 


About 145 million years ago, volcanoes
erupted all over Earth, darkening 
skies and snuffing out thousands 
of species. One group that vanished 
was the Mesturidae, deep-bodied
fish with powerful teeth for crushing
coral. But some fish that swam in the
same waters, such as the pointy-snouted 
Acipenseriformes, survived the upheaval, 
and later evolved into today’s sturgeons. 
Evolutionary biologists have long de- 
bated why some species survived while 
countless others died in this and other ex- 
tinctions. Now, one group of researchers 
proposes that being active at night—as stur- 
geons are—confers a survival advantage. 
The model, which inferred the habits of 
ancient fish based on the behaviors of thou- 
sands of living species, suggests that after 
the tough times passed, the nocturnal species
quickly diversified, with some shifting
into day dwellers and replacing the species
gone missing. 

Researchers already knew that many 
nocturnal mammals survived the mass ex- 
tinction that followed an asteroid strike 
66 million years ago, whereas the dinosaurs, 
which were largely diurnal, did not (except 
for birds). The new study extends that pat- 
tern to other catastrophes and other species, 
including aquatic ones. It represents “one 
of the largest tests of the role of behavior 
in mass extinctions,” says Pincelli Hull, an
oceanographer at Yale University who stud- 
ies mass extinctions and wasn’t part of the 
research. “This is really quite exciting.” 

“I’ve never thought about this before,”
adds Prosanta Chakrabarty, an ichthyo- 
logist at Louisiana State University. But in 
part because the idea is so novel, “I remain 
skeptical,” he says. A diver who often visits 
the seas at night, he says it’s tough to tell 
whether a fish is active in the dark. 
Most fossils reveal little about an organism’s
behavior over a 24-hour period. So
Maxwell Shafer, an evolutionary biologist 
at the University of Toronto, and his col- 
leagues at the University of Basel focused 
on living fish, which represent half of all 
vertebrates. They combed the literature to 
determine the day-night behaviors of al- 
most 4000 bony fish species and 135 carti- 
laginous ones, such as sharks, and plotted 
these behaviors on a fish tree of life. Then 
they carried out many computer simula- 
tions of possible day-night activity patterns 
in the ancestors of modern fish, until they 
came across the one that best reproduced 
the day-night patterns of today’s species. 

It’s an impressive feat, says Roi Maor, an 
evolutionary ecologist at the Royal Botanic 
Gardens, Kew. “This work stretches, perhaps 
to the limit, the power of current evolution- 
ary modeling techniques” to figure out what 
happened millions of years ago, he says. 

Some fish, like the sturgeon, have remained
nocturnal throughout their history.
According to their modeling, the ancestor
of all fish was also nocturnal, the team reported
on 1 November on bioRxiv, as were
many of the fish species that survived mass
extinctions. The team also found that fish
“have transitioned many times between
being nocturnal and diurnal”—more often
than other vertebrates.

Shafer found those shifts were most evident
in the wake of periods of elevated extinction
145 million and 66 million years
ago. Temperatures spiked during both
events, and he proposes that by coming out
only in the dark, nocturnal animals avoided
potentially lethal daytime peaks. Once the
environmental upheavals were over, nocturnal
species could exploit now-empty niches
by shifting to diurnality and diversifying,
as mammals did after the dinosaurs were
gone. Among fish, this eventually restored a
balance between nocturnal and diurnal species,
the team found.


Modern sturgeon, like their ancestors, hunt prey at night, which may have helped the group survive extinctions.


Previous studies had suggested early
mammals were nocturnal to avoid dinosaurs.
After those predators disappeared,
mammals switched to being active during
the day. Amphibians and other land vertebrates
also tended to be nocturnal throughout
much of their evolution, according to
other studies, but more and more have become
diurnal. Shafer’s team proposes that
for all vertebrates, nocturnality was a survival
advantage during catastrophes.

Others remain cautious about the claims.
“The key conclusion of the study—that
nocturnality conferred an evolutionary
advantage—is supported by the evidence,
[but] there could be more than one reason for
such a survival pattern,” Maor says. He and
Hull point out that when Earth went dark
66 million years ago, diurnal predators
couldn’t see to catch food and plants died
off, dooming herbivores. Many nocturnal
animals forage on detritus, which didn’t disappear
as quickly. The advantage “could be
something about the feeding ecology,” rather
than nocturnality per se, agrees Jonathan
Payne, a paleontologist at Stanford University.

Payne praises how the researchers used 
relationships among living species to de- 
duce behaviors of extinct ones. But, “I 
wouldn’t say that I’m incredibly convinced” 
about some of the conclusions, he says. 

Even so, Haijun Song, a paleontologist at 
the China University of Geosciences, thinks 
Shafer’s effort could offer a glimpse of the 
future, as our world faces upheaval from 
climate change. “This article gives a reason- 
able prediction that nocturnal animals will 
be more likely to survive.”
Can AI help
scientists surf 
a paper flood? 


Technical and legal barriers 
may hinder widespread use 


By Jeffrey Brainard 


When Iosif Gidiotis began his doctoral
studies in educational technology
this year, he was intrigued
by reports that new tools powered
by artificial intelligence (AI) could
help him digest the literature in
his discipline. With the number of papers 
burgeoning—across all of science, close to 
3 million were published last year—an AI re- 
search assistant “sounds great,” says Gidiotis, 
who is studying at the KTH Royal Institute 
of Technology. He hoped AI could find more 
relevant papers than other search tools and 
summarize their highlights. 

He experienced a bit of a letdown. When 
he tried AI tools such as one called Elicit, he 
found that only some of the returned papers 
were relevant, and Elicit’s summaries weren’t 
accurate enough to win him over. “Your in- 
stinct is to read the actual paper to verify 
if the summary is correct, so it doesn’t save 
time,” he says. (Elicit says it is continuing to 
improve its algorithms for its 250,000 regular 
users, who in a survey credited it with sav- 
ing them 90 minutes a week in reading and 
searching, on average.) 

Created in 2021 by a nonprofit research or- 
ganization, Elicit is part of a growing stable 
of AI tools aiming to help scientists navigate 
the literature. “There’s an explosion of these
platforms,” says Andrea Chiarelli, who follows
AI tools in publishing for the firm Research
Consulting. But their developers face chal- 
lenges. Among them: The generative systems 
that power these tools are prone to “halluci- 
nating” false content, and many of the papers 
searched are behind paywalls. Developers are 
also looking for sustainable business models; 
for now, many offer introductory access for 
free. “It is very difficult to foresee which AI 
tools will prevail, and there is a level of hype, 
but they show great promise,” Chiarelli says. 
Like ChatGPT and other large language
models (LLMs), the new tools are “trained” 
on large numbers of text samples, learning 
to recognize word relationships. These associations
enable the algorithms to summarize
search results. They also identify
relevant content based on context in the 
paper, yielding broader results than a 
query that uses only keywords. Building 
and training an LLM from scratch is too 
costly for all but the wealthiest organiza- 
tions, says Petr Knoth, director of CORE, 
the world’s largest repository of open- 
access papers. So Elicit and others use ex- 
isting open-source LLMs trained on a wide 
array of texts, many nonscientific. 

Some of the tools go further. Elicit, for ex- 
ample, organizes papers by concept. A query 
about too much caffeine results in separate 
sets of papers about reducing drowsiness 
and impairing athletic performance. A pre- 
mium version, which costs $10 per month, 
uses additional, in-house programming to 
boost accuracy. 

Another tool called Scim helps draw the 
reader’s eye to a paper’s most relevant parts. 
A feature of the Semantic Reader tool cre- 
ated by the nonprofit Allen Institute for AI, 
it works like an automated ink highlighter, 
which users can customize to apply differ- 
ent colors to statements about novelty, objec- 
tives, and other themes. It provides “a quick 
diagnostic, a triage, about whether [a paper] 
is worth engaging with,” which “is very valu-
able,” says Eytan Adar, an information sci-
entist at the University of Michigan who tried
out an early version before an expanded one 
was unveiled last month. Several of the tools 
also annotate summaries with excerpts from 
papers on which they are based, allowing us- 
ers to judge the accuracy for themselves. 
To try to avoid generating false responses,
the Allen Institute operates Semantic Reader 
using a suite of LLMs, including ones trained 
on scientific papers. But the effectiveness of 
this approach is difficult to measure. “These 
are hard technical problems at the periphery 
of our understanding,” says Michael Carbin, 
a computer scientist at the Massachusetts In- 
stitute of Technology who helped develop an 
algorithm to summarize medical literature. 
According to Dan Weld, chief scientist at the 
Allen Institute’s Semantic Scholar repository 
of papers, “Right now, the best standard we 
have is to have a very educated human look at 
[the AI output] and carefully analyze it.” The 
institute has gathered feedback from more 
than 300 paid graduate students and thou- 
sands of volunteer testers. Quality checks re- 
vealed that applying Scim to non-computer 
science papers produced glitches, so the insti- 
tute is currently offering Scim for only about 
550,000 papers in computer science. 

Other researchers emphasize that the AI 
tools will only reach their potential if devel- 
opers and users can access papers’ full text to 
inform search results and analysis of content. 
“If we can’t access the text, then our view of
the knowledge that’s captured in those texts is
limited,” says Karin Verspoor, a computational
linguist at RMIT University, Melbourne. 

Artificial intelligence tools promise to help
researchers digest scholarly literature in new ways.

Even Elsevier, the world’s largest scien-
tific publisher, limits its AI tools to papers’
abstracts. In August, the commercial firm
debuted an AI-assisted search feature in its
Scopus database, whose listings of 93 million
research publications make it one of the larg-
est for scientists. In response to a query, its al-
gorithms identify the most relevant abstracts 
and use a version of ChatGPT to provide an 
overall summary. (The tool restructures user 
queries to reduce the fabricated responses 
ChatGPT sometimes delivers.) Scopus AI 
also groups the abstracts by concept. The 
abstracts-only approach is consistent with 
the terms of Elsevier’s licensing agreements 
with other publishers that allow their pa- 
pers’ abstracts to be listed in Scopus, says 
Maxim Khan, senior vice president for ana- 
lytics products and data platforms at Elsevier. 
For now, users tell Elsevier, that approach is 
sufficient for “[helping] researchers in cross- 
disciplinary fields trying to get their head 
around a particular topic quickly,” he says. 

The Allen Institute has taken a different 
approach: It negotiated agreements with 
more than 50 publishers that allow its devel- 
opers to data mine the full text of paywalled 
papers. Weld says almost all the publishers 
have offered access at no cost because the 
AI drives traffic to them. Even so, licensing 
restrictions limit Semantic Reader users to 
accessing the full text of only 8 million of Se- 
mantic Scholar’s 60 million full-text papers. 
And Knoth says such negotiations are prohib- 
itively time-consuming for his organization. 
“It can hardly be seen as a fair, level playing
field,” says Knoth, whose university-funded
repository works to develop tools to help sci- 
entists explore its content. 

Enabling data mining on a broad scale 
will also require getting more authors and 
publishers to adopt non-PDF formats that 
help machines efficiently digest a paper’s 
contents. A White House directive in 2022 
requires that papers produced with federal 
funding be machine readable, but agencies 
have yet to propose details. 

Despite the challenges, computer scientists 
are already looking to develop more sophisti-
cated AIs, able to glean even richer informa-
tion from the literature. They want to harvest 
clues to enhance drug discovery and continu- 
ally update systematic reviews. Research sup- 
ported by the Defense Advanced Research 
Projects Agency has explored systems able to 
automatically generate scientific hypotheses, 
by identifying gaps in existing knowledge as 
revealed by published papers. 

But for now, scientists using AI tools need 
to maintain a healthy level of skepticism, says 
Hamed Zamani of the University of Massa- 
chusetts, Amherst, who studies interactive 
information-access systems. LLMs “will defi- 
nitely get better. But right now, they have a 
lot of limitations. They provide wrong infor- 
mation. So scientists should be very aware of 
that, and double check their output.” 
AGRICULTURE


Fern proteins show promise 
against crop pests 


These and other ancient plants may provide alternatives to chemical insecticides


By Elizabeth Pennisi 


The pretty ferns that adorn windowsills
and gardens have some surprising
powers. Biologists have long known
that this ancient group of plants wards
off hungry insects better than other
flora, and now they’re homing in on
why. They’ve discovered fern proteins that 
kill and deter pests, including, most recently, 
one that shows promise against bugs resis- 
tant to widely used natural pesticides. 

The new protein, described last month in 
the Proceedings of the National Academy of 
Sciences (PNAS), adds to a growing arsenal 
that could one day provide a fresh alterna- 
tive to chemical insecticides. “These proteins 
have great potential and may represent a 
new mode of pesticide action,” says Juan Luis 
Jurat-Fuentes, an entomologist at the Univer- 
sity of Tennessee, Knoxville. They are excit-
ing, says Kristina Sepčić, a biochemist at the
University of Ljubljana, because they “have 
proven to be active against insect [popula- 
tions] resistant to certain bacterial toxins.” 

Since the late 1930s, proteins isolated from 
a soil bacterium called Bacillus thuringiensis
(Bt) have become a mainstay of natural pest 
control. They were first used as an insecti- 
cidal spray, but more recently scientists en- 
gineered genes for these proteins into crops. 
Farmers around the world planted more
than 100 million hectares of these transgenic 
plants in 2019. 

Transgenic corn and cotton alone saved 
growers more than $50 billion in lost crops 
in the first 2 decades of their use, according 
to Corteva Agriscience. Bt pest control also 
brought environmental benefits, reducing 
the use of organophosphate insecticides and 
other toxic chemicals. 

But it may not be working as well as it 
used to. When Bruce Tabashnik, an entomo- 
logist at the University of Arizona, reviewed 
25 years of data on corn, sugarcane, cotton, 
soybeans, and other Bt crops from seven 
countries, he found signs that populations of 
11 pest species have evolved substantial re- 
sistance to the proteins. Cases of resistance 
jumped from three in 2005 to 26 in 2020, he 
and his colleagues reported in April in the 
Journal of Economic Entomology. That trend 
is continuing, Jurat-Fuentes says. Tabashnik 
has found 17 additional instances where pests 
were becoming resistant. 

“This is of great concern,” says Marilyn
Anderson, a biochemist at La Trobe Uni- 
versity. “We do not want to return to heavy 
use of chemical insecticides.” She is among a 
small group of scientists eyeing fern proteins 
as an alternative. In the wild, these ancient 
plants, which evolved long before the plants
now used as crops, often seem unaffected
by insects.

The Cretan brake fern contains insecticidal proteins
that can deter corn and soybean pests.

In the 1990s, researchers sprayed crops 
with fern extracts, with mixed results. Oth- 
erwise, ferns and other non-seed producing 
plants got little attention as possible insect 
killers. Then, in 2016, researchers from India 
inserted a gene from a halberd fern (genus 
Tectaria) into cotton, hoping to fight sap- 
sucking whiteflies. Because no other natural 
insecticides had ever worked against this 
pest, says P.K. Singh, a plant biotechnolo- 
gist at the CSIR-National Botanical Research 
Institute, “We thought to explore non- 
obvious and unrelated sources for insecti- 
cidal activity.” 

The halberd fern gene protected the cot-
ton from whiteflies and other sucking pests,
and Singh has now isolated other fern com- 
pounds that deter chewing insects, such as 
caterpillars. He says his team has engineered 
the corresponding genes into cotton and seen 
very “interesting” and “promising” results in 
field studies. 

Evidence that ferns might harbor use- 
ful insecticides also emerged from a col- 
laboration between Corteva and Anderson’s 
company, Hexima. Starting 8 years ago, 
Anderson’s team examined 10,000 Austra- 
lian plants, testing extracts against pest 
insects in the lab and exposing them to 
digestive enzymes to determine whether 
they’d likely break down in the human 
gut and therefore be safe to use on crops. 
Corteva, meanwhile, screened plants from 
North America and elsewhere. Both teams 
looked for proteins with a novel mechanism 
that could replace Bt, Anderson says. 

In 2019, Corteva reported that genes for 
proteins found in maidenhair ferns could 
protect soybeans from soybean looper and 
velvetbean caterpillars, and since then both 
groups have sharpened their focus on ferns. 
“We have since discovered several families 
of insecticidal proteins from these plants,”
Corteva said in a statement. They don’t yet 
know exactly how these proteins work. 

In the recent PNAS paper, a team includ- 
ing Anderson and the Corteva scientists 
report the latest potential weapon against 
pests: a protein from Pteris cretica cv. Al- 
bolineata, sometimes called Cretan brake 
fern, ribbon fern, or table fern, which is a 
common houseplant native to Europe, Asia, 
and Africa. In the lab, extracts of the fern 
stunted the growth of soybean looper and 
corn earworm. Distant relatives of the fern 
have variants of this protein, the researchers 
discovered, indicating it arose early in fern 
evolution, about 300 million years ago. They 
dubbed this group of proteins IPD113. 
Co-author Megan Maher, a structural bi-
ologist at the University of Melbourne, and
colleagues solved the structure of one vari- 
ant. They found that it resembles the Bt pro- 
teins used as insecticides, except it has just 
two major active parts, whereas Bt proteins 
have three. Bt proteins work by puncturing 
the insect gut. The researchers think the 
fern proteins do, too, but because the active 
part missing in fern proteins is the one Bt 
proteins use to bind receptors on the cell 
membranes, the fern proteins may bind dif- 
ferent receptors. “The hope is the new fern 
proteins can be Goldilocks insecticides— 
similar enough to Bt to be safe and effec- 
tive yet different enough to kill insects that 
evolved resistance to Bt,” Tabashnik says.

When the Corteva team transferred the 
genes for the most effective IPD113 versions 
into maize, leaf damage from key pests such 
as fall armyworm and corn earworm fell to 
at most 30% compared with more than 50% 
in unmodified maize. The fern proteins also 
worked against insect strains resistant to Bt 
proteins. The paper “is an excellent advance 
and establishes ferns as a repertoire of new 
molecules,” Singh says.

In lab tests, corn earworms usually devour
maize leaf samples (right), but one carrying a newly
identified fern protein was protected.

These successes will likely attract inter-
est from other research groups, says Georg
Jander, a chemical ecologist at the Boyce
Thompson Institute. And he thinks other
companies are quietly casting an even
wider net for new insecticidal proteins.
Jander and Boyce Thompson fern bio- 
logist Fay-Wei Li, for example, are looking 
into defense compounds of primitive plants 
called liverworts. And Sepčić is evaluating
mushroom-derived compounds that kill 
insects by a different mechanism. Instead 
of binding protein-based receptors on the 
cell membrane as the fern proteins do, they 
bind the lipids the membranes are made of. 
Because these lipids are conserved across 
the tree of life, Sepčić thinks insects will
not easily evolve resistance. 

If such compounds prove effective against 
Bt-resistant pests, proteins from some of the 
earliest land organisms may help ensure the 
future of food security. 
INFECTIOUS DISEASE


Novel coronavirus blamed 
for Cyprus cat deaths 


Coopted dog virus sequences may have boosted the strain 


By Catherine Offord 


When thousands of cats started to
get sick and die in Cyprus this
year, the crisis made interna-
tional news. Symptoms such as
fever, a swollen belly, and leth-
argy pointed to feline infectious
peritonitis (FIP), a common condition 
caused by a cat coronavirus—but scientists 
struggled to explain the surge in cases. 
Now, researchers have identified a possible 
culprit: a novel feline coronavirus that 
has borrowed key RNA sequences from a 
highly virulent dog pathogen called pan- 
tropic canine coronavirus (pCCoV). The 
findings, posted on 9 November as a pre- 
print on bioRxiv, could help explain how 
severe illness spread so widely. 

“They’ve done a great job in identify- 
ing what looks to be a very interesting and 
concerning virus,” says Gary Whittaker, a 
virologist at the Cornell University College 
of Veterinary Medicine. Although canine- 
feline coronavirus crossovers have been re- 
ported before, this is the first documented 
case of a cat coronavirus combining with 
pCCoV, apparently leading to a “perfect 
storm of both disease and transmissibility.” 

Veterinarians in Cyprus raised the alarm 
early this year about increased cases of FIP, 
which is not related to COVID-19 and does 
not affect humans. By July, media outlets re- 
ported nearly 300,000 cat deaths, though lo- 
cal veterinarians revised that to about 8000. 
In August, the Cypriot government green- 
lighted the veterinary use of SARS-CoV-2 
medication molnupiravir, which blocks 
coronavirus replication and appears to be 
an effective treatment for FIP. 

The explosion in cases presented a puz- 
zle. Most feline coronaviruses infect the 
gut, causing mild infections that don’t esca- 
late to FIP. Strains sometimes mutate into 
a form called FIP virus (FIPV) that infects 
immune cells and triggers serious disease. 
But unlike intestinal strains, which spread 
easily through feces, FIPV typically isn’t 
transmitted between cats. 

To learn more, researchers ran RNA se- 
quencing on fluid from the abdomens and 
spines of sick cats in Cyprus. They found a 
previously undescribed feline coronavirus, 
which they dubbed FCoV-23, that contains 
a chunk of RNA from the dog virus pCCoV.
(The “pantropic” in its name means that, 
unlike regular intestinal canine corona- 
viruses, pCCoV infects many tissues.) 

FCoV-23 seems to have arisen when a fe- 
line coronavirus encountered pCCoV in an 
unidentified animal host and coopted the 
latter’s spike protein—the structure corona- 
viruses use to gain access to host cells. 
Study co-author Christine Tait-Burkard, a 
virologist at the University of Edinburgh’s 
Roslin Institute, says this and other genetic 
tweaks may have allowed FCoV-23 to cause 
FIP while still infecting the intestines and 
spreading through feces. The team specu- 
lates that the spike protein changes could 
also help stabilize FCoV-23 outside an ani- 
mal host, increasing chances of transmis- 
sion via contact with contaminated feces. 
It’s unclear how far FCoV-23 has spread, 
though the team identified one case in the 
United Kingdom in a cat imported from 
Cyprus. The general risk to cats outside the 
island remains low, Tait-Burkard says. 

Margaret Hosie, a virologist at the Uni- 
versity of Glasgow, says that although it’s 
exciting to see virological data emerging 
from the Cypriot population, there remain 
many open questions. More data are needed 
to confirm FCoV-23 is transmitted directly 
among cats through feces, she says. 

Increased awareness could explain some 
of the apparent rise in FIP cases this year, 
she adds. Without historical case numbers, 
“we can’t say there’s been a huge outbreak.” 
Feline coronaviruses and pCCoV have co- 
existed in the Mediterranean region for 
years, so it’s possible that the genetic cross- 
over happened some time ago. 

Tait-Burkard and colleagues are collabo- 
rating with researchers in Cyprus to test lo- 
cal cats for FCoV-23 and get better estimates 
of its prevalence and fatality rate. They also 
want to investigate whether unique fea- 
tures of FCoV-23 explain the resulting dis- 
ease’s apparently high rate of neurological 
symptoms—twice that of typical FIP. 

In the meantime, the discovery of this 
mixed cat-dog coronavirus highlights the im- 
portance of taking a cross-species approach 
to understanding viral evolution, Whittaker 
says. “This feline coronavirus has got huge 
potential for us to understand what goes on 
in general in coronavirus virology.” 
Doubts renewed about anti-inflammation claims


Did university misconduct probe truly clear work on inflammation-stopping lipids?


By Gunjan Sinha


Doubts about work claiming that cer-
tain molecules actively quell inflam-
mation in the body have intensified.
Critics have doubled down on their
objections with a recent paper, and
Science has learned that although a
university publicly exonerated one of the
researchers involved of misconduct, it ap-
pears to have at one point reached a differ-
ent conclusion.

The furor surrounds pioneering work
on lipids known as specialized proresolv-
ing mediators (SPMs), done by
Brigham and Women’s Hospital
biochemist Charles Serhan and
his former postdoc Jesmond Dalli,
a molecular pharmacologist now
at Queen Mary University of Lon-
don (QMUL). More than a dozen
scientists who have analyzed SPM
research by Dalli, Serhan, and oth-
ers formally published a critique
of the work in Nature Communi-
cations this month. The analysis,
which appeared last year as a pre-
print, attacks the method used to
detect SPMs in many papers for
producing misleading results. In
the same journal, Dalli and a col-
league attacked the critique for
several reasons, including a “mis-
representation” of his group’s cri-
teria for detecting SPMs.

The critics’ concerns had
sparked misconduct investigations by QMUL
and Harvard Medical School (HMS), of
which Serhan’s hospital is an academic affili-
ate. This summer, in response to an inquiry
from Science, QMUL said it had cleared Dalli
of wrongdoing. “There were absolutely no
findings of data falsification or fabrication
of data,” a QMUL spokesperson wrote in
an email. The school subsequently posted a
statement online saying HMS had found no
misconduct by Serhan, who identified some
of the first SPMs in 1984. (HMS says it “does
not comment on individual circumstances
and specific cases.”)

But a source at QMUL familiar with its
inquiry tells Science that the university’s re-
search investigative panel did find evidence
of misconduct. Science has also obtained an
email sent by the QMUL vice principal of re-
search and innovation stating that the panel
“upheld all the allegations” against Dalli and
a colleague at the school and that the mat-
ter “is now progressing to QMUL’s disciplin-
ary process.” Furthermore, a spokesperson
for the Wellcome Trust, a funder of Dalli’s
research, tells Science it “has put in place
some conditions in relation to any future
applications from Professor Dalli.” QMUL
declined to comment on how its public
statements and the email could be recon-
ciled and did not share the full findings of
its investigation.

Research on SPMs has burgeoned in re-
cent years, driven in part by the possibility
that they could lead to drugs able to shut
down damaging inflammation. But last
year, Science reported on the increasing
skepticism that SPMs resolve inflamma-
tion in the body (6 May 2022, p. 565). In the
peer-reviewed analysis published last week,
a group of 15 researchers said the mass
spectrometry method used to detect SPMs
in many of Dalli’s papers “artifactually [de-
tects] lipids where none exist.”

An illustration shows a dying neutrophil (purple) releasing inflammation-
resolving lipids (yellow) as a macrophage (blue) starts to clear its remains.

Many of the questions about Serhan and
Dalli’s papers concerned what the two re-
searchers have called “illustrations” but
other scientists have interpreted as real
data displays. In a comment on PubPeer,
Dalli wrote: “Their scope is simply to illus-
trate the presence of molecules of interest
in as much as an illustration of a mouse in
a figure is not reporting the mouse that was
used in an experiment but rather denoting
that mice were used.” QMUL’s investigation
into Dalli took note of the illustrations, the 
school spokesperson emailed Science, but 
they called any assumption that they were 
actual data a “misinterpretation” that “has 
been dealt with in correspondence with 
journals and senior authors.” 

Some experts in mass spectrometry
remain dissatisfied. “Illustrations were 
claimed to be real data in a large number of 
publications, and this is absolutely not com- 
mon practice,” says lipid biochemist Valerie 
O’Donnell at Cardiff University, an author 
of the Nature Communications critique. 
“Moreover, the same artwork 
posed as different samples across 
13 different papers.” 

Neither Serhan nor Dalli has 
shared raw data that might help 
resolve the doubts, adds Garret 
FitzGerald of the University of 
Pennsylvania, who is also an au- 
thor of the critique, even though 
they stated on PubPeer that they 
would. During peer review of the 
Nature Communications paper,
the journal requested raw data. 
QMUL and Dalli refused, citing 
reasons such as patient confiden- 
tiality, according to FitzGerald. 
“We tried to support Dalli and his 
institution to provide anonymized 
raw data, but for unknown rea- 
sons this was impossible,” he says. 

One former Dalli co-author 
who has gained access to raw 
data contends they are flawed. Derek 
Gilroy, an immunology researcher at Uni-
versity College London who is not an author
on the Nature Communications paper, had
independent lipid experts reanalyze data
Dalli provided for a 2017 study on which 
Gilroy collaborated. They found no SPMs in 
the samples, contrary to the paper’s initial 
findings. Gilroy has written a correction, 
available as a preprint, and has submitted 
it to the journal. “It’s upsetting to be made 
to believe in a story that the wider scien- 
tific community simply doesn’t agree with,” 
he says. 

Dalli, however, quickly posted a rebuttal 
preprint. Neither he nor Serhan responded 
to Science’s requests for comment. 


Gunjan Sinha is a freelance science journalist living 
in Berlin. 
Jack McCann survived a case of Japanese encephalitis during the first-ever outbreak in temperate Australia.


RUDE 
AWAKENING 


The appearance of a “tropical” mosquito-borne illness in 
southeastern Australia has unsettled researchers 


By Meredith Wadman 


Construction supervisor Jack
McCann started to feel “a bit 
crook”—that’s “sick,” in Australian 
slang—on the hot afternoon of 
26 February 2022. He and some 
buddies had just finished laying 
a fireplace hearth in his backyard 
in Corowa, Australia, population 
5500. His friends suggested a trip 
to the pub. McCann, then 30, told them he 
needed to beg off. “Usually, I would have 
been the first one there,” he says. 

Corowa sits beside the Murray River, 
which in that region forms the border be- 
tween the states of New South Wales (NSW) 
and Victoria, to the south. It’s a scenic 
area that draws tourists to fish, boat, and 
swim every summer. It’s also rich with wet- 
lands that make ideal mosquito breeding 
grounds. The river slides along slowly about
300 meters from McCann’s front door. 

McCann went to bed unusually early that 
night. He woke the next morning drenched 
in sweat and vomited his breakfast. He 
rarely got sick, and he wasn’t a complainer, 
but he asked his partner to take him to 
the small local hospital. When his mother, 
Jo McCann, a nurse, visited him 24 hours 
later, “I was absolutely horrified by his 
appearance,” she recalls. “He was moan- 
ing. He was pale. He was photophobic. He 
just looked terrible.” She asked the charge 
nurse, “How do I get him transferred?” 

By the time Jack McCann arrived by 
ambulance at the regional Albury Hospi- 
tal the next day, he was struggling to put 
sentences together and had developed a 
headache so bad, he recalls, “it felt like my 
brain was trying to pop out of my head.”

To infectious disease physician Sam 
Thorburn, McCann’s headache, fever, and
confusion pointed to encephalitis, inflam- 
mation of the brain. The white blood cells 
found in his spinal fluid were consistent 
with that condition. But those weren’t the 
only clues Thorburn had. McCann was the 
fourth patient in as many weeks admitted 
to Albury with encephalitis. Like McCann, 
the three others had turned up feverish 
and confused. One, age 46, had improved 
enough to be discharged. The other two, 
ages 61 and 75, had descended into comas 
and required ventilation. 

A slew of tests had failed to reveal the 
underlying cause of encephalitis in any of 
the three patients. They didn’t have herpes 
simplex infection, a dangerous but treatable 
cause of encephalitis, or HIV, which can
cause similar symptoms. And it wasn’t cryp- 
tococcus, a fungus that infects the brains of 
people with weakened immune systems. 

But also prominent among the pos- 
sible causes were mosquito-borne viruses, 
in particular two encephalitis-causing 
viruses endemic to Australia: Kunjin, a 
strain of West Nile virus, and Murray Val- 
ley encephalitis virus (MVEV), named for 
the river valley where McCann has swum, 
water skied, fished, and boated since he 
was a boy. 

Endemic arboviruses (an abbreviation 
for arthropod-borne virus) haven’t caused 
huge epidemics in Australia because they 
aren’t transmitted between humans. But 
certain wild birds and other animals can 
harbor and amplify them, churning out 
hundreds of millions of copies after being
bitten by an infected mosquito. When an 
animal stuffed with virus is bitten again, 
the mosquito can easily ferry the virus to 
unsuspecting humans nearby. 

The vast majority of people infected 
with an arbovirus in Australia develop 
only mild symptoms, or none at all. But 
some become very sick. They can be left 
with chronic disabilities. And some die. 

Thanks to a La Niña weather pattern,
the spring month of November 2021 was 
NSW’s wettest since record-keeping began 
in 1900, and it ushered in an exceptionally 
rainy summer. Around Albury, the numbers 
of mosquitoes were “wild, like nothing we 
had seen before,” Thorburn recalls. But the 
tests of the first three patients’ blood had 
come back negative for Kunjin and MVEV, 
and another less likely possibility present in
northern Australia: dengue. A month after
the first patient was admitted, Thorburn
still had no diagnosis.

The Murrumbidgee River in New South Wales in
December 2021. Record-breaking rain flooded rivers
and created ideal mosquito breeding grounds.

Then, on the evening of 27 February, 
while McCann was still in the tiny hospi- 
tal in Corowa, Thorburn read a message 
posted a day earlier on Ozbug, a listserv of 
Australian infectious disease physicians. It 
began: “Dear colleague, Today, health de- 
partments have alerted physicians ... to the 
detection of Japanese encephalitis virus in 
pigs in Victoria, NSW and Queensland.” 
Commercial piggeries, Thorburn knew, 
dotted the region the hospital serves. 
He quickly texted the link to a colleague,
adding: “Dudeeee. This is guna be what it is.” 

“OMG,” the colleague texted back.

The next day, Thorburn sent the three 
patients’ blood for Japanese encephalitis vi- 
rus (JEV) antibody tests at the nearest state 
department of health lab, in Melbourne, 
adding McCann’s when he was transferred 
from Corowa later that day. Over the next 
several days, all four patients’ tests would 
come back positive. 

Thorburn was stunned. He knew that 
people were sometimes diagnosed in Aus- 
tralia after returning from tropical coun- 
tries such as Indonesia and Papua New 
Guinea where Japanese encephalitis, the 
disease caused by JEV, is endemic. But only 
one person on the mainland had ever been 
diagnosed with the disease—and that was 
2900 kilometers north of Albury, in tropi- 
cal Queensland, in 1998. Finding this fre- 
quently fatal or debilitating disease in the 
country’s temperate southeastern quadrant 
was unheard of. 

When Linda Hueston, principal scientist 
in the arboviruses and emerging diseases 
unit at NSW Health Pathology, ran the first 
test of a patient sample, she saw “some- 
thing that can’t possibly be here because 
we don’t have JEV in southeast Australia.” 
Assuming her test had failed, Hueston ran 
it again. And again. “And then you discover 
there’s nothing wrong with your tests,” she 
recalls. “I thought: ‘Why here? Why now? 
What next?’” 

Answers to Hueston’s questions remain 
elusive. But mosquitoes, waterbirds, and 
domestic pigs have all emerged as key links 
in a chain of virus transmission and amplifi- 
cation, with weather—those unprecedented 
rains—as a driving force. 

The outbreak would continue for months, 
eventually sickening 45 people, killing six of 
them, and infecting countless others. Yet its 
toll barely registers on the scale of damage 
done globally by JEV, especially in South- 
east Asia. 

“Diseases that mainly affect rural popula- 
tions in lower and middle-income countries 
do not get story lines unless something like 
this happens,” says Sean Moore, an infec- 
tious disease modeler at the University of 
Notre Dame. Still, the outbreak may por- 
tend a worrisome expansion of JEV’s range, 
he says, and that deserves attention in an 
era of climate change. 

The unexpected emergence of the dis- 
ease also exposed major gaps in research- 
ers’ knowledge. “What I was surprised 
about is how little we understand about 
mosquito populations in Australia,” says 
Rebecca Rockett, a molecular virologist 
at the University of Sydney. “If you actu-
ally want to stop or prevent these out-
breaks from happening, then we’re going
to need to understand that animal-human-
mosquito interface.” She, too, wonders
about the future. “How often is this go-
ing to happen. ... Is that driven by climate
change? I think these are amazingly im-
portant questions to ask.”

Hot zones

Areas dense with piggeries in southeastern Australia overlap with those most likely to harbor mosquitoes
that can carry Japanese encephalitis virus (JEV). Normally found in the tropics, JEV caused an outbreak of
Japanese encephalitis that sickened 45 people in 2022, killing six. Most cases were in the temperate southeast.


ONE HUNDRED and eighty kilometers north 
of McCann’s hospital bed in Albury, just 
outside the tiny agricultural town of 
Grong Grong, Robert Johnston, a general 
manager for the Pig Improvement Com- 
pany (PIC) Australia, was dealing with 
an outbreak of his own. Johnston runs a 
2500-sow pig farm, raising 1200 animals a 
week for slaughter and supplying genetic 
stock, via semen shipments, to some 50% 
of the sows in Australia’s AU$5.5 billion 
pork industry. 

The piggery sits among flat canola fields 
not far from wetlands that are breeding 
grounds for dozens of species of water- 
birds. The recent rains had brought four 
floods in 8 months. At one point, Johnston 
says, flooding from the Murrumbidgee 
River, normally 5 kilometers away, came 
within 300 meters of the farm.

“There was water everywhere,” he recalls. 
“The mosquitoes along the outside [of the 
pig sheds] were horrendous.” Then in mid- 
January, several litters of piglets were born 
trembling—a sign of oxygen deprivation or 
trauma during birth. Most couldn’t suckle 
and were euthanized. “We knew there was 
something wrong,” Johnston says. 

Bernie Gleeson, a veterinarian with PIC
Australia’s parent company SunPork Farms
who tends to the pigs at Grong Grong, had
organs from several of the piglet carcasses 
sent to the state’s veterinary pathology lab, 
at the Elizabeth Macarthur Agricultural In- 
stitute (EMAI) near Sydney. But tests for a 
variety of likely viruses turned up nothing. 

Then on 22 February, he got an emer- 
gency call from Grong Grong. Early the 
next morning, he was in the farrowing 
shed at the piggery, watching in horror 
as, one after another, three sows birthed 
litters of mummified piglets with domed 
heads, missing brains, swollen bellies, and 
contracted limbs. 

Mummification is a well-known phenomenon
in which a piglet dies during gestation,
dries out, and is later born shrunken
and leathery, often alongside healthy litter- 
mates. It can happen in sows that are 
stressed, for example by poor nutrition or 
disease. Normally, the piggery’s combined 
losses due to stillbirths and mummification 
hover around 10%. But that week, losses 
nearly doubled. “We had total litters wiped 
out,” Johnston says, in what he later called 
“a firestorm of stills and mummies.” 

Donna McPherson, the piggery’s work 
health and safety coordinator, remembers 
how the grim business devastated employ- 
ees. “Particularly in the farrowing shed, 
they’re very nurturing people,” she says. “It 
had a huge effect on staff morale, going in 
day after day ... having to induce the sows. 
It was horrific for them to have to ...” she
stops. “And then dispose of them.” 

On 23 February, Gleeson sent organs 
from 24 mummified and stillborn piglets 
to EMAI. The institute’s principal veteri- 
nary virologist, Deb Finlaison, told him 
that for 2 weeks, the institute had been re- 
ceiving a growing number of samples from 
stillborn and diseased piglets from across 
the state’s pig-producing regions. When 
Gleeson described the swarms of mosqui- 
toes that had descended on the region in 
November, they agreed she would immedi- 
ately test samples from the shaking piglets 
born at Grong Grong in January for a virus 
that suddenly loomed large: JEV. Unlike 
Kunjin and MVEV, a hallmark of JEV is
that it causes stillbirths and mummifica- 
tion in pigs. 

On 25 February, the government 
officially diagnosed the Grong Grong 
piggery with JEV. It soon put a stop- 
movement order on the operation, 
freezing all its pigs in place. On 
1 March, Australia’s health depart- 
ment made clear that the problem 
wasn’t isolated to Grong Grong: It 
announced the detection of JEV at 
eight pig farms, six of them in NSW. 
Eventually more than 80 Australian 
piggeries would be declared infected. 

The risk, it appeared, was not just to 
pigs. All four of the patients in Albury 
Hospital lived within 7 kilometers of 
a commercial pig farm. That’s well 
within the flying range of Culex annu- 
lirostris, the mosquito species that is 
JEV’s predominant vector in Australia. 


STARTING IN the mid-1800s, Japan ex- 
perienced waves of a mysterious en- 
cephalitis every 10 years or so. It felled 
children in particular and caused 
widespread alarm. In 1935, in the 
midst of one such outbreak, research- 
ers isolated the causative virus from 
the brain of a patient who had died of 
the disease, which was soon labeled Japanese
encephalitis. 

JEV has since spread across much of Asia 
and the western Pacific, becoming endemic 
in 24 countries from India to Taiwan and 
from China to Papua New Guinea. In 2016, 
there was a first case in Africa, and the virus 
has been found in birds in Italy. The World
Health Organization estimates there are
68,000 cases annually and at least 13,000 
deaths, although other estimates are higher. 

When a human is bitten by an infected 
mosquito, the virus multiplies in the skin 
and lymph nodes and travels transiently 
through the blood. At this stage most peo- 
ple are asymptomatic or experience fleet- 
ing symptoms such as lethargy and fever. 
But in a few cases—less than 1%—the virus 
dodges immune defenses and migrates to 
the brain, where it quickly kicks off pro- 
longed inflammation that kills neurons 
and produces dramatic symptoms. Typi- 
cally fever is accompanied by cognitive 
changes, headache and vomiting, and 
sometimes seizures or a masklike face, indicative
of harm to the same neurons that
are damaged in Parkinson’s disease.

Sows at a piggery in rural Victoria. More than 80 Australian piggeries
were found to harbor Japanese encephalitis virus in 2022.

Most of those affected in endemic coun- 
tries are young children, who haven’t had 
a chance to develop antibodies against the 
virus. (The same can be said of almost the 
entire Australian population.) As many as 
one-third of patients who develop encepha- 
litis die, and 30% to 50% of the survivors 
are left with serious deficits including 
weakness, convulsions, and severe cogni- 
tive impairment. No medicines to treat JEV 
have been approved, but safe and effective 
vaccines exist and have made a major dent 
in case numbers globally. Many poorly re- 
sourced endemic countries don’t vaccinate 
universally, however. The Philippines began 
to vaccinate only recently, and Bangladesh 
is just starting. 

Australia had been largely spared. A hand- 
ful of human cases on the country’s tropical 
Badu Island and the one case in northern 
Queensland in the 1990s were presumed 
to have originated from infected mosqui- 
toes in nearby Papua New Guinea that were 
borne south on high winds. Then in Feb- 
ruary 2021, a woman in Australia’s tropi- 
cal Tiwi Islands died after being infected 
with a rare genotype of JEV previously 
identified with certainty only in Indonesia. 

Now, that rare genotype of the virus had 
arrived 2500 kilometers farther south. Se- 
quencing of the virus that caused the 2022 
outbreak showed it was 99.8% identical to 
the one in the Tiwi Islands. 

Many experts think it could have made 
the journey in wading birds, such as 
herons and egrets. Some species are 
natural reservoirs for JEV, and young, 
nonimmune birds in particular can act 
as virus factories, transmitting it to un- 
infected mosquitoes. Decades of re- 
search from Asia show that human 
infections have tracked geographically 
with these birds’ ranges. 

Although the virus has never been
detected in wild Australian waterbirds, [...]
egret species and one heron species
that can be infected with JEV in the 
lab. And just before the 2022 out- 
break, bird life was booming. March 
2020 marked the end of a 3-year 
drought in the Murray-Darling Basin, 
a 1-million-square-kilometer area that 
drains the Murrumbidgee and Mur- 
ray rivers and that includes Corowa, 
Albury, and Grong Grong. Parched 
lakes began to refill, and in areas 
surveyed annually, wetland cover- 
age more than doubled between Oc- 
tober 2019 and October 2020. When 
the record-breaking rainfall arrived 
the next year, there was an “explosion
of waterbird breeding,” says Richard
Kingsford, a waterbird ecologist at the Uni- 
versity of New South Wales whose team 
runs the annual aerial surveys of waterbirds 
in eastern Australia. 

These abundant waterbirds, many of them 
chicks that were especially susceptible to the 
virus, could help explain how pigs got in- 
fected in 2021 and 2022, Kingsford says. “As 
those chicks fledge, they will then disperse 
widely across Australia,” he says. “Where you 
have got a piggery, it’s quite possible that you 
would have one of these birds feeding on a 
nearby wetland and mosquitoes moving be- 
tween them.” 

Some experts think there’s a 
strong possibility that JEV ar- 
rived on the northern coast via 
migrating infected waterbirds 
in 2020 and bided its time— 
maintained in the birds and in 
feral pigs, of which Australia 
has millions—until the inun- 
dation of the Murray-Darling 
Basin beckoned the birds south 
in the spring of 2021. At that 
point, legions of pigs at com- 
mercial farms appear to have 
quickly ramped up an outbreak. 

But whether this is actually 
what happened may never be 
known. Other researchers have 
proposed that the virus traveled 
south in clouds of infected mos- 
quitoes borne from the north- 
ern coast to the southeast by 
a cyclone. 

The role of pigs in the out- 
break can’t be proved either, 
says Cameron Webb, a medical entomo- 
logist at NSW Health Pathology and the Uni- 
versity of Sydney. Human cases did cluster 
near piggeries. But it’s possible people were 
infected far more often by mosquitoes that 
bit infected waterbirds than by mosquitoes 
that bit infected pigs, he says. (Of note, no 
piggery employees are known to have been 
sickened in the outbreak.) 

Of one thing Webb is certain, however: 
“The emergence of Japanese encephalitis
virus is no doubt linked to La Niña-dominated
conditions we had in eastern
Australia: lots of flooding, above average 
rainfall.” Those conditions favored not just 
the birds, but also the mosquitoes. 

Cx. annulirostris, abundant in the 
Murray-Darling Basin, relies on standing, 
fresh water to breed and develop. With the 
expansion of the area’s wetlands in 2021, it 
saw a population explosion. Webb and col- 
leagues later analyzed the landscape around 
dozens of domestic piggeries and found a 
strong association between transient wet- 
land expansion and piggery infection—a 
link at least partly explained by the proliferation
of mosquitoes.

As with other weather events, the 
record-breaking wetness of the 2021-22 
season can’t be attributed with certainty 
to climate change. But as the globe warms, 
the atmosphere holds more water, enabling 
more intense rainfall and flooding; daily 
rainfall associated with thunderstorms in- 
creased between 13% and 24% in Australia 
between 1979 and 2016. 

“El Niño and La Niña cycles are natural,
but they are more extreme than they have
ever been before,” says Eloise Skinner, an
epidemiologist at Stanford University and
Griffith University. Those extremes can create
and remove water sources, changing the
distribution of species, including those that
bear disease, she says. “As someone that
looks at animals, diseases, and movement, I
think climate change is critically important.”

Egret chicks in a marsh in New South Wales. Egrets are natural reservoirs of JEV
and young, nonimmune birds can amplify the virus.


AUSTRALIA RESPONDED forcefully to the JEV 
outbreak, dialing up public health messag- 
ing about mosquito bite prevention and 
ultimately obtaining and distributing more 
than 125,000 vaccine doses to at-risk pop- 
ulations. Staffs at piggeries got priority: 
All but two of 26 employees at the Grong 
Grong farm rolled up their sleeves 11 days 
after the farm was diagnosed. Meanwhile, 
piggery managers combated mosquitoes by 
mowing grass, applying insecticides, remov- 
ing standing water, and repairing window 
screens in offices and lunch rooms. The 
Grong Grong farm saw a quick return to pig 
shipping, but like much of the region’s pig 
industry, it took a considerable financial hit. 

The government declared the hu- 
man outbreak officially over in June. But 
Gleeson knows the threat hasn’t vanished.
“We're making thousands of little virus fac- 
tories every week on our farms,” he says, 
“and presumably, they are all susceptible 
to infection and to amplifying the virus. 
It’s a problem.” 

For researchers, the arrival of JEV in 
Australia has motivated new efforts to 
understand what happened and predict
future outbreaks. “We need a more
holistic, integrated approach where we 
make sure that we can detect diseases, 
whether they be in animals, mosquitoes, 
water and soil, or humans, as early as possible,”
says mathematical modeler Roslyn
Hickson of James Cook Univer- 
sity (JCU) and the Common- 
wealth Scientific and Industrial 
Research Organisation. Earlier 
this year, she and JCU viro- 
logist Paul Horwood, with other 
colleagues, published a model 
of the risk of JEV transmission 
to humans across Australia, 
based on population density 
and the presence of relevant Cu- 
lex mosquito species, pigs, and 
waterbirds. Risk was highest in 
coastal and inland regions of the 
country’s southeastern quad- 
rant—including a large section 
of the Murray-Darling Basin. 

One of the lab’s postdocs, 
Anjana Karawita, is mining 
stored samples from trapped 
birds for genomic analyses that 
should yield the first data on 
JEV exposure in wild birds in 
the country. And the team is 
eying other unexplored possible reservoirs 
of the virus. “Australia has unique animals 
that aren’t seen in other areas where [JEV] 
circulates. So we have a lot of unknowns 
about what animals are going to be competent
reservoirs,” Horwood says. “This is
something we really need to get on top of 
quickly.” 

Mosquito tracking efforts have ramped 
up, too. Now that the warm season, from 
October to April, has arrived, Webb is ana- 
lyzing mosquitoes from roughly 140 traps 
in NSW—including 40 newly placed in in- 
land areas since the JEV outbreak. In the 
lab, his team freezes the mosquitoes, counts 
them, and identifies their species, then 
grinds them up in an industrial-strength 
“cocktail shaker” for genetic analysis of 
what viruses they're carrying. If JEV is lurk- 
ing, the researchers hope to find it early so 
officials can issue public warnings and miti- 
gate human infections. (None was found in 
grinds from the 2022-23 summer season.) 

The trapping efforts have made clear
that the retreat of the virus was not for
want of mosquitoes: A local surveillance
crew in the Murray-Darling Basin in November
2022 collected a record-breaking
33,000 mosquitoes in one 16-centimeter-diameter
trap.

Cameron Webb examines a mosquito trap this month in mangrove wetlands along the Parramatta River in Sydney. Surveillance has expanded since the 2022 JEV outbreak.

The NSW health department has mean- 
while added JEV testing to another pro- 
gram, which uses young chickens as early 
warning systems for human-sickening 
arboviruses. Affectionately known as “the 
girls,” the birds have never been exposed to 
mosquitoes before they are left in outdoor 
pens from November to April or May. Their 
blood is collected weekly. Blood from one 
chicken, collected in February 2022, tested 
positive for JEV antibodies, but newer col- 
lections have shown no sign of the virus. 

However, surveillance on the other side 
of the country has offered new warning 
signs. Sentinel chickens in Western Austra- 
lia tested positive for JEV exposure early 
this year. And in March, antibodies to the 
virus were identified for the first time in 
feral pigs in Western Australia. “What it 
means is that this virus is already widespread
across Australia,” Horwood says.
“And that makes it more likely that the vi- 
rus will become endemic.” 


AFTER 5 DAYS in the hospital, Jack McCann 
was well enough to be discharged home. 
He had lost 11 kilograms. His mother 
teared up at the sight of her once-robust 
son lying on a couch, pale, and diminished. 
“At that time, I don’t think I realized just 
how lucky Jack had been,” she says.

Two weeks later, Thorburn found 
McCann’s cognitive functioning “within 
normal limits,” though his mother describes
his short-term memory during that time as 
“terrible.” Initially too exhausted even to 
mow his lawn, McCann gradually regained 
strength. This past August, he said: “I feel 
back to as good as I was before I was crook.” 

McCann was one of the 39 Australians 
diagnosed with JEV in 2022 who survived. 
Among the six others who were not so fortunate
was the 61-year-old man who was
still on a respirator at Albury when McCann
was discharged. He was removed from life
support after 3 months. A government survey
of antibodies in blood samples collected
in June and July 2022 from more than
1000 people in Corowa and a handful of
other relevant towns found that nearly one
in 10 had been infected with JEV.

The widespread Culex annulirostris mosquito is the
main vector of JEV in Australia.

As another summer approached in No- 
vember 2022, Thorburn and his colleagues 
braced themselves for more JEV cases. 
There were three by late December. Then 
the virus seemed to vanish. Instead, they 
got something still deadlier: an outbreak of 
disease caused by its close cousin, MVEV. In 
the state of Victoria, where Thorburn now 
lives and works, it sickened six people and 
killed five of them earlier this year. (Four
others died in other states.) As he treated
some of these patients at Austin Hospital
in Melbourne, “it was a bit eerie how simi- 
lar the presentations were” to the previous 
year’s JEV cases, he says. Here, too, he had 
no medications to attack the disease. 

Researchers are as baffled as doctors 
about the resurgence of MVEV. Skinner at- 
tended a mosquito control conference in 
August where much of the discussion was, 
“Where did JEV go? Why is MVEV here?” 
she says. To her mind, climate change has 
ushered in an age of uncertainty. “Weird 
things are happening with encephalitic vi- 
ruses in Australia,” she says. “We don’t know 
what’s coming.”

Measuring the impacts of air pollution


Reduced air pollution from coal power plants decreased mortality more than expected 


By Robert Mendelsohn and Seung Min Kim


It is known that air pollution affects
mortality, but measuring the magnitude
of its impact is difficult. On page 941 of
this issue, Henneman et al. (1) combine
careful mortality modeling and panel
methods in a study of air pollution from
coal power plants. A panel dataset was built
using health records of Medicare recipients
across the United States over almost two decades.
The authors matched this mortality
data with predicted exposures to coal power
plant emissions of sulfur dioxide (SO₂),
which decreased over the past two decades
as a result of air pollution regulations and
coal power plant retirements. Henneman
et al. found that these lower emissions
reduced mortality more than earlier
estimates suggested.
These results imply that the benefits of
reducing coal emissions are greater than
previously estimated and that measures to
reduce SO₂ emissions should be widely adopted.

Henneman et al. followed the careful mortality
modeling that has been widely adopted
in the epidemiological literature (2). But they
also moved from just studying mortality over
space to studying mortality with panel data
that vary over both time and space. Panel
methods can control for unwanted variation
in just time or just space by including fixed
effects (3, 4). The combination of both careful
mortality modeling and panel methods
is particularly innovative for the field. For
example, many other panel studies of mortality
have not been as careful to control for
individual factors, such as sex, race, and,
especially, age (3, 4).

The mortality of older people is reduced in communities
downwind of coal power plants, such as GenOn’s
Cheswick Generating Station in Pennsylvania (now
closed), when their emissions are lowered or stopped.

Henneman et al. specifically examined
how mortality changes over both time and
location as upwind SO₂ emissions from individual
coal power plants in the United
States changed between 1999 and 2016. They
followed insights from the epidemiological 
literature and used an exponential func- 
tional form to model mortality, individually 
controlling for sex, race, and age. They also 
controlled for many other variables such as 
temperature, income, and population den- 
sity, which vary over both space and time 
and could confound the results. Henneman 
et al. examined a vast dataset of individuals 
that encompassed 650 million person-years 
to carefully detect subtle changes in mor- 
tality rates. All of these features make their 
study an excellent example of how to carry 
out a panel study of mortality. 
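
As a rough illustration of what such a panel specification can look like, the expression below combines an exponential functional form with location and year fixed effects. It is only a sketch: every symbol is an assumption introduced here for illustration, and it is not the specification that Henneman et al. estimated.

```latex
\[
  \mathbb{E}\!\left[D_{i,t}\right] \;=\; N_{i,t}\,
  \exp\!\left(\beta\, C_{i,t} + \gamma^{\top} X_{i,t} + \alpha_i + \tau_t\right)
\]
% D_{i,t}  : deaths in location i during year t (assumed notation)
% N_{i,t}  : person-years at risk in that location and year
% C_{i,t}  : predicted exposure to coal-derived pollution
% X_{i,t}  : controls that vary over space and time (temperature, income, density, ...)
% \alpha_i : location fixed effect, absorbing differences that are constant over time
% \tau_t   : year fixed effect, absorbing shocks common to all locations
```

With both fixed effects included, the coefficient on exposure is informed by how mortality in a given place changes as its own exposure changes, rather than by permanent differences between places.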

Henneman et al. found that a 1 µg m⁻³
exposure to fine particulate matter (PM₂.₅),
which is produced from the chemical transformation
of SO₂ (5), from coal power plants
increased mortality rates in people older 
than 65 years by 1.12%. This reduced the
average life expectancy of a 65-year-old by
a month. They predict that the annual coal 
power plant emissions and exposures from 
1999 through 2020 have led to 460,000 cu- 
mulative deaths among those over 65 years 
of age. The mortality impacts were especially
high in the eastern United States because
this region has more SO₂ emissions
and has higher population densities than 
other regions of the United States. 
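
To make the reported effect size concrete, the short sketch below converts the 1.12% figure into a relative mortality rate for other exposure changes. It assumes the exponential (log-linear) form mentioned above holds across the range considered, which is an extrapolation made here for illustration; the function and constant names are hypothetical.

```python
import math

# Reported in the text: a 1 µg/m^3 exposure raised mortality rates by 1.12%.
PCT_PER_UG_M3 = 1.12


def relative_risk(delta_ug_m3: float, pct_per_unit: float = PCT_PER_UG_M3) -> float:
    """Relative mortality rate for an exposure change of delta_ug_m3.

    Assumes a log-linear dose-response, so effects compound
    multiplicatively per unit of exposure (an illustrative assumption).
    """
    beta = math.log(1.0 + pct_per_unit / 100.0)  # log rate ratio per µg/m^3
    return math.exp(beta * delta_ug_m3)


def percent_change(delta_ug_m3: float) -> float:
    """Percent change in the mortality rate for the same exposure change."""
    return (relative_risk(delta_ug_m3) - 1.0) * 100.0


if __name__ == "__main__":
    for delta in (-1.0, 0.5, 1.0, 2.0):
        print(f"{delta:+.1f} µg/m^3 -> {percent_change(delta):+.3f}% mortality rate")
```

Under this assumption a reduction in exposure lowers the rate by slightly less than the same-sized increase raises it, because the effect is multiplicative rather than additive.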

The study by Henneman et al. also predicts
that the rapid decline of SO₂ emissions
from coal power plants over the past
two decades has led to a large reduction 
in excess deaths. This decline was caused 
by a combination of regulations and power 
plant closures. The predicted excess mor- 
tality from coal power plant emissions in 
2020 was just 3% of the predicted excess 
mortality in 1999. Thus, the US regulations
that reduced SO₂ emissions from coal
power plants were effective at protecting 
human health. 

A limitation of the study of Henneman et
al. is that it focused solely on the SO₂ emissions
from each coal power plant. The authors
did not measure direct PM₂.₅ emissions
from these plants. They did include the effect
of nitrogen oxide (NOₓ) on PM₂.₅ in one
experiment, but even in this case, they did
not consider how NOₓ combines with emissions
of volatile organic compounds (VOCs)
from other sources to form ozone (O₃) (6).
Both NOₓ and direct PM₂.₅ emissions from
coal power plants also fell over time because
of government regulations and disappeared
when coal plants were closed. Both
of these pollutants could have contributed
to the observed air pollution mortality impacts
(7) downwind of coal power plants.

There is also a question about whether 
the calculations of air pollutant concentra- 
tions at the point of exposure are accurate. 
A problem specific to studying the impact of 
air pollution on mortality is that emissions 
at the source must be connected to concen- 
trations of pollutants at the point of impact. 
There are four dimensions to this measure- 
ment: emissions, dispersion, deposition, 
and chemical transformation. Henneman 
et al. accounted for many important fac- 
tors that affect downwind atmospheric 
concentrations, such as wind direction, 
wind speed, cloud ceiling height, and wet 
and dry deposition (which removes pollutants
from the air). They also modeled the
transformation of SO₂ into PM₂.₅. However,
they did not measure or predict the ammonia
(NH₃) concentrations in the atmosphere
from other sources, such as livestock feedlots
and ammonia-based fertilizers. NH₃
speeds up the chemical transformation
of SO₂ into PM₂.₅ (8). Regions with more
NH₃ would have relatively more PM₂.₅ as a
result. All of these factors are difficult to
measure carefully across the entire United
States. A more complete atmospheric chemistry
model that includes all sources of air
pollution would improve the accuracy of
predicted exposures, but these models are 
expensive to run. Studies of mortality have 
come a long way in predicting pollution 
exposures that have likely contributed to 
mortality, but there is still room to improve 
their accuracy. 

The careful panel approach of Henneman 
et al. should be more widely adopted to study 
specific causes of death as well as morbidity. 
Their panel approach is also appropriate for 
studying the effect of many environmental
factors, such as warmer temperatures, on 
human health. Additionally, the analysis 
could also be expanded to include the value 
of clean water sources or the value of dif- 
ferent ecosystems on health. Finally, the 
analysis could be extended to public health 
measures that control the spread of many 
endemic diseases, such as malaria, dengue 
fever, and hepatitis.

Mechanical properties pattern the skin


Morphogens induce variations in tissue mechanics to promote feather budding 


By Samhita P. Banavar and Celeste M. Nelson


As tissues develop, their growth is
guided by diffusible molecules known
as morphogens. In some tissues, initially
uniform fields of cells become
patterned over time to form spatially
distinct anatomical features, such as
hair follicles in the skin or villi in the intestine.
This transition from anatomical uniformity
to spatially patterned diversity is called
symmetry breaking. Classically, symmetry
breaking was thought to be initiated by spatial
differences in gene expression caused by
morphogens (1). On page 902 of this issue,
Yang et al. (2) propose an alternative mecha- 
nism: spatial differences in the mechanical 
properties of clusters of cells, which give rise 
to mechanical instabilities at the “supracel- 
lular” scale. The study combines analyses of 
skin explants and cultured dermal cells from 
embryonic chicks (Gallus gallus domesticus) 
with a theoretical framework to show that 
patterning of the avian skin emerges from the 
mechanical properties of the dermal tissue, 
which are influenced by morphogens. 

The term “morphogenesis” derives from 
the Greek words “morphé,” meaning shape,
and “genesis,” meaning emergence, and is the 
process by which a tissue generates its form. 
Morphogens can provide signals that guide 
the gene-expression changes necessary for 
morphogenesis. It has largely been assumed 
that initiation of the changes in tissue shape 
at the start of morphogenesis is also due to 
intracellular signaling and alterations in gene 
expression downstream of morphogens. The 
emergence of these biochemical patterns 
would necessitate a “prepattern” in the con- 
centration of the morphogens themselves. 

The skin is a prototypic example of an 
organ whose underlying pattern of append- 
ages (hair follicles in mammals or feather 
follicles in birds) is guided by morphogens, 
such as fibroblast growth factor (FGF) or 
bone morphogenetic protein (BMP). Feather 
follicles are evenly spaced in the avian skin, 
which is composed of three layers: the out- 
ermost epidermis, the intervening base- 
ment membrane, and the innermost dermis. 
Previously, the locations at which hair or 
feather follicles formed were thought to be 
determined by prepatterned spatial variations
in the concentrations of morphogens
within the dermis. An initially homogeneous 
field of morphogens can resolve itself into 
discrete spots of high concentration from 
the interplay between chemical reactions 
and diffusion (3). However, recent work has 
reported that follicles also form in response 
to physical patterns. In mammalian skin, 
hair follicles emerge through cell rearrange- 
ments, which give rise to stationary cores of 
cells surrounded by a dynamic field of motile 
cells that consists of two popu- 
lations—one moving clockwise 
and one moving counterclock- 
wise (4). In avian skin, feather 
follicles bud at the same time 
as patterns of gene expression 
appear, rather than after (5). 
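
The idea that an initially homogeneous field can resolve into spots through reaction and diffusion can be illustrated with a textbook linear-stability check for a two-component system. The sketch below is not taken from the studies discussed here: the Jacobian entries and diffusion coefficients are hypothetical values chosen only so that the uniform state is stable without diffusion but unstable to perturbations of intermediate wavelength once the inhibitor diffuses faster than the activator.

```python
import math

# Hypothetical activator-inhibitor kinetics, linearized about the uniform state.
JAC = ((1.0, -1.0),
       (2.0, -1.5))
D_U, D_V = 1.0, 10.0  # assumed diffusion coefficients (inhibitor diffuses faster)


def max_growth_rate(q, jac=JAC, d_u=D_U, d_v=D_V):
    """Largest real part of the eigenvalues of J - q * diag(d_u, d_v), with q = k^2."""
    (a, b), (c, d) = jac
    a_q, d_q = a - d_u * q, d - d_v * q
    trace = a_q + d_q
    det = a_q * d_q - b * c
    disc = trace * trace - 4.0 * det
    if disc < 0:
        return trace / 2.0  # complex-conjugate pair: real part is trace / 2
    return (trace + math.sqrt(disc)) / 2.0


def unstable_band(jac=JAC, d_u=D_U, d_v=D_V, q_max=2.0, steps=200):
    """Sampled squared wavenumbers q at which small perturbations grow."""
    qs = [q_max * i / steps for i in range(steps + 1)]
    return [q for q in qs if max_growth_rate(q, jac, d_u, d_v) > 0.0]


if __name__ == "__main__":
    band = unstable_band()
    print("uniform state growth rate:", max_growth_rate(0.0))
    print("unstable q range:", (min(band), max(band)) if band else None)
    print("with equal diffusion:", unstable_band(d_v=D_U))
```

In this toy setting, perturbations at q = 0 decay, a finite band of wavelengths grows, and the band disappears when both species diffuse at the same rate, which is the sense in which diffusion itself selects a pattern spacing.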

Yang et al. demonstrate that 
the emergence of the bud of 
the follicle begins with the ruf- 
fling of the basement membrane 
under the epidermal layer. The 
dermal layer then separates into 
two regions of cells: a stiffer, 
solid-like core and a more fluid 
margin, the mechanical proper- 
ties of which were confirmed using atomic 
force microscopy. Using transcriptomic and 
immunofluorescence analyses, the authors 
found that signaling downstream of FGF led 
to a stiffening of cells in the core. By contrast, 
signaling downstream of BMP led to a more 
fluid-like domain of tissue at the margin. 
After this supracellular phase separation, the 
basement membrane was degraded and the 
margin contracted, pushing the stiffer core 
out of the plane of the tissue and causing the 
feather bud to emerge. The symmetry break- 
ing that forms the pattern of feather follicles 
is thus a physical process, influenced by mo- 
lecular and genetic controls. 

Understanding the mechanical proper- 
ties of complex tissues is challenging (6-8). 
Yang et al. adopted methods from outside of 
developmental biology to study the extent to 
which the mechanical properties of dermal 
cell clusters might be modulated by FGF and 
BMP in culture. In one approach, the authors 
dried reconstituted dermal tissue after treat- 
ing it with FGF or BMP and then mapped the 
patterns of cracks to infer whether the tissues
became more solid (brittle) or more fluid
(less brittle). In another approach, based on
surface tension of coalescing droplets (9), aggregates
of dermal cells were cultured with
FGF or BMP and the extent to which they co- 
alesced was used to infer whether the tissues 
became more solid-like (in which aggregates 
failed to merge) or more fluid-like (in which 
aggregates merged). These approaches need 
to be validated against standardized models 
to determine whether they might be appli- 
cable to other developing tissue systems. 

Recent studies using mouse embryos
hinted at mechanical underpinnings for symmetry-breaking
events during branching of
the pulmonary airways (10, 11),
condensation of digit cartilage
(12, 13), and the emergence of intestinal
villi (14, 15). Collectively,
these findings suggest that the
process of morphogenesis relies
on spatial patterns in the
physical properties (mechanical
stiffness, volumetric growth,
active forces) of the tissue that
are determined by morphogens.
The results of Yang et al. now 
indicate that spatial patterns in 
physical properties might also 
serve as the triggers that induce 
uniform fields of cells to begin 
morphogenesis in the first place. 

The findings of Yang et al. show that mor- 
phogens can do more for tissue development 
than just alter gene expression; they can 
also, in principle, cause patterns of mechanical
properties to emerge at the supracellular
scale, which can prompt populations of cells
to change the overall shape of their constituent
tissue. Understanding how the interplay
between molecular dynamics and supracellu- 
lar mechanics affects the emergence of mor- 
phological patterns will become a powerful 
asset in studies of tissue development. 


Giving birth gives birth to neurons


In mice, pregnancy results in new neurons that support recognition of pups 


By Gerd Kempermann


Neurogenesis (the birth of new neurons
from stem cells) is very limited
in the adult brain but contributes
to highly specific brain functions,
most notably learning and memory.

Thus far, its contribution to other 
brain functions has been less clear. On 
page 958 of this issue, Chaker et al. (1) re- 
port that in mice, pregnancy elicits tran- 
sient waves of neurogenesis in specific sub- 
sections of the subventricular zone (SVZ), 
the neurogenic zone that produces new in- 
terneurons for the olfactory bulb through- 
out life. These subsections were barely 
neurogenic in the absence of pregnancy. 
Once in the olfactory bulb, the newborn 
neurons contributed to the recognition of 
the young mice by smell. The data make a 
compelling case for how adult neurogen- 
esis in the olfactory bulb contributes to an 
important brain function beyond learning 
and memory. 

The effects of pregnancy on adult neuro- 
genesis in the mothers have been studied 
in animal models for many years, often in 
the context of how changes in sex hormones 
across the life span affect the generation of
new neurons (2). Most of these studies focused
on the hippocampus, the neurogenic
zone involved in learning and memory, and 
often reported a decrease in cell prolifera- 
tion and neurogenesis during pregnancy 
(3). This suggested a temporary decrease in 
the plasticity of the mother’s hippocampus. 
Few of these studies have investigated neu- 
rogenesis in the adult olfactory bulb. One 
study that did, however, found that the lac- 
tation hormone prolactin mediates a preg- 
nancy-dependent increase in olfactory bulb 
neurogenesis and thus a positive effect on 
mothering behavior in rats (4). 

Chaker et al. did not specifically explore 
the question of what mediates the selective 
response of different SVZ precursor cells 
to pregnancy and the differentiation into 
the described population of interneurons, 
which lasted only for the time when the 
pups were young and maternal attention 
was critical (see the figure). However, a 
very suggestive signaling connection be- 
tween the hypothalamus (the brain struc- 
ture overseeing hormonal control) and the 
activation of particular populations of SVZ 
precursor cells was previously reported 
(5). It is likely, though, that several mecha- 
nisms interact to regulate pregnancy-asso- 
ciated neurogenesis. 

The identification by Chaker et al. of 
the physiological regulation of adult neu- 
rogenesis and function of the newborn 
cells in the olfactory bulb has ethological
relevance. In a previous study, disrupting
adult neurogenesis in the olfactory bulb of 
female mice did not seem to change ma- 
ternal behavior, although it reduced gen- 
eral social interaction (6). Therefore, the 
findings of Chaker et al. indicate that the 
contribution of adult neurogenesis is more 
subtle and much more specific than previ- 
ously thought. Nevertheless, although the 
pregnancy-dependent activation of neuro- 
genesis in the SVZ might be selective, as 
the new findings suggest, it still seems to 
be broader than necessary to produce just 
the one described interneuronal popula- 
tion. Thus, there might be other functions 
to be found. 

The study of Chaker et al. also exempli- 
fies the growing trend of using single-cell 
transcriptomics data to define cell popula- 
tions and their functional state. Precursor 
cell mosaicism in the SVZ was originally 
observed by using preidentified markers, 
cell morphology, and information on cell 
development (7). A transcriptomics-based 
atlas of cell types in the mouse SVZ has 
since been published (8). The new study 
now indicates that different subregions of 
precursor cells respond differently to phys- 
iological stimulation, and transcriptomic 
profiling also allowed the identification of 
a functionally defined neuronal population 
in the olfactory bulb. 

Pregnancy stimulates adult neurogenesis

During pregnancy in mice (1), quiescent stem cells in the subventricular zone (SVZ) of the walls of the lateral ventricle (LV) become
activated (2). The daughter cells travel to the olfactory bulb (OB) and become transient “pregnancy-associated”
interneurons (3), which contribute to how mothers recognize their offspring by smell (4).

In a previous study, single-cell transcriptomics
were used to demonstrate the
molecular equivalence between mouse
and human precursor cell populations in 
the adult hippocampus (9), but also high- 
lighted important differences. Such a com- 
parison has not yet been performed for the 
SVZ (or the olfactory bulb), but Chaker et 
al. speculate that what they observed in 
mice might also be applicable to humans. 
Adult neurogenesis in the human olfactory 
bulb occurs at very low levels or might 
even be fully absent (10, 11). There is stem
cell activity in the human SVZ, but it is not 
yet known whether the cells that Chaker 
et al. observed in mice also persist in hu- 
mans and, if so, whether they contribute 
to a similar pregnancy-associated surge in 
adult neurogenesis. Many mothers report 
that their sense of smell changes during 
pregnancy, but despite anecdotal reports 
of hypersensitivity or odor intolerance, 
the available scientific literature does not 
support such an enhancing effect; a recent 
meta-analysis instead indicated that preg- 
nancy was associated with reduced odor 
discrimination and identification (12).
However, after birth, the smell of the child 
is particularly important for forming the 
bonds between mother and newborn (13). 
It is tempting to speculate that this func- 
tion relies on pregnancy-induced neuro- 
genesis in the human olfactory bulb. 

Maternal behavior during pregnancy 
and the postpartum period has multiple 
effects on brain development in the off- 
spring of many species, including humans. 
Furthermore, studies in mice and rats 
suggest that this behavior has transgen- 
erational effects on adult neurogenesis— 
some transient, some lasting (14, 15). It is 
fascinating that plasticity in the mother 
thus appears to promote plasticity in the 
offspring.

Breaking a bottleneck for
thermoelectric generators


A phase diagram-based screen identifies optimal interface 
materials for devices that convert heat into electricity 


By Bo Xu and Yongjun Tian 


Thermoelectric generators are solid-state
devices that directly convert
heat into electricity, rendering them 
invaluable for power generation and 
for applications that reuse otherwise- 
released heat energy. An illustrative 
example is the radioisotope thermoelectric 
generator, which has powered the Voyager 
probes in outer space for more than four de- 
cades. There has been a spectacular advance 
in the performance of thermoelectric materials
(1-3). However, their
use in thermoelectric gen- 
erators has been hampered 
by the lack of robust mate- 
rials that constitute the in- 
terface between electrodes 
and thermoelectric materi- 
als within the device (4). On 
page 921 of this issue, Xie 
et al. (5) report a promis- 
ing strategy for identifying 
optimal interface materials 
for different thermoelectric 
materials. This could break 
the bottleneck in advancing thermoelectric 
power generation and ultimately reduce en- 
ergy costs and emissions (6). 
Thermoelectric generators typically op- 
erate under demanding thermal and me- 
chanical conditions, including substantial 
temperature gradients, thermal stresses, 
and mechanical fatigue (7). Beyond the 
intrinsic thermoelectric properties of 
materials, the interface between a ther- 
moelectric material and electrode holds 
considerable influence over the device's 
output performance and long-term stabil- 
ity. During operation, atomic diffusion and 
chemical reactions at the interface can give 
rise to device instability and degradation, 
particularly at elevated temperatures. To 
counteract these detrimental effects, the 
incorporation of efficient and stable barriers
to atomic diffusion, referred to as thermoelectric
interface materials (TEiMs),
is imperative. The conventional selection 
criteria for TEiMs revolve around both 
matching thermal expansion for mechani- 
cal robustness and aligning work function 
for low contact resistance (8). However, 
candidate materials have traditionally 
been identified by trial and error, reliant 
on intuition and experience. This process 
is time-consuming and costly. 

To deal with this dilemma, Xie et al. 
adopted a comprehensive approach that 
capitalizes on multicomponent phase dia- 
grams of target thermoelec- 
tric materials and selected 
metals. Phase diagrams 
were constructed through 
density functional theory 
(DFT) calculations, a quan- 
tum-mechanical method 
of studying the electronic 
structure and properties of 
materials. This approach 
affords the examination 
of thermodynamic phase 
equilibria within multi- 
component systems and 
imparts valuable information for selecting 
TEiMs on the basis of equilibrium chemi- 
cal reactions. The authors harnessed the 
widely used Open Quantum Materials 
Database (a resource of DFT-calculated 
thermodynamic and structural properties 
of more than 1 million materials) to com- 
pute these phase diagrams (9). 

In the case of MgAgSb, a high-perfor- 
mance thermoelectric material operating 
below 573 K (10), Xie et al. established 
quaternary M-Mg-Ag-Sb phase diagrams, 
with M representing a chosen metal ele- 
ment. These phase diagrams facilitated 
the screening of potential TEiMs that ex- 
hibit stable two-phase equilibrium with 
MgAgSb. MgCuSb was identified, synthe- 
sized, and characterized, revealing that 
its thermal expansion and work function 
closely matched those of MgAgSb. The 
phase boundaries between MgAgSb and 
MgCuSb exhibited finite atomic interdiffusion,
indicative of an effective adhesion
of these two materials. Notably, MgCuSb 
displayed exceptional thermal stability
and an interfacial contact resistivity (a
measure of the difficulty for electrical current
to flow across the interface) below 1
microhm cm² in the annealed MgAgSb/
MgCuSb junction. Modules with MgCuSb 
and MgAgSb demonstrated a remarkable 
heat-to-electricity conversion efficiency 
of 9.25% across a 300 K temperature dif- 
ferential (see the figure). These outcomes 
were substantiated through international 
round-robin testing across three laborato- 
ries in three countries. Furthermore, Xie et 
al. successfully extended the TEiM screening
strategy to other thermoelectric materials,
including (Bi,Sb)₂Te₃, ZnSb, CoSb₃,
and ZrCoSb. 

A similar approach rooted in phase diagrams
was recently applied to the thermoelectric
material GeTe (11). The resulting
module, which used NiGe as the interface 
material for GeTe, achieved a record-high 
efficiency of 12% across a 545 K tempera- 
ture differential. However, the general ap- 
plicability and validation process of Xie 
et al. make their screening approach for
interface materials a substantial advance 
in the field. 

Phase diagram calculations are a widely 
adopted tool in traditional metallurgy and 
ceramics for predicting compound stabil- 
ity and guiding synthesis conditions (12). 
When combined with DFT calculations, 
this technique’s versatility and accessibil- 
ity render it advantageous for designing 
new, high-performance functional mate- 
rials, such as TEiMs (5, 11), high-entropy 
materials (13), and hydrogen storage materials
(14). Nonetheless, it is prudent to
acknowledge some inherent limitations
in the DFT methodology that could affect
calculation results (15). For instance, it
performs poorly in dealing with weak in- 
teractions (such as van der Waals forces) 
and strongly correlated systems (such as 
transition-metal catalysts and systems 
containing localized electronic states). 
Furthermore, calculating phase diagrams 
at finite temperatures can be arduous and 
may overlook some high-temperature stable
phases (11).

As device fabrication techniques con- 
tinue to advance, the widespread imple- 
mentation of commercialized thermoelec- 
tric generators will become increasingly 
feasible. The strategy of Xie et al. may be 
the breakthrough needed to help revolu- 
tionize power generation by providing a 
clean and sustainable source of energy.

Optimizing the interface

Developing interface materials has lagged behind the discovery of thermoelectric materials, stalling the
advance of thermoelectric generators. An approach based on the computation of phase diagrams for
both materials identified MgCuSb as a robust interface for the high-performance material MgAgSb in a
module (“p-type” leg) for a generator.

A fluctuating solution to the dolomite problem

Episodes of dissolution and crystal growth stoke
the formation of a common carbonate mineral


By Juan Manuel Garcia-Ruiz 


The impressive massif of the Dolomite
Mountains in Northern Italy
was formed almost entirely of
CaMg(CO₃)₂, a calcium-magnesium
carbonate mineral discovered in 1791
by the French naturalist Déodat de
Dolomieu (1), who gave his name to both the
mineral—dolomite—and to the impressive 
Alpine rocky landscape considered by the 
architect Le Corbusier as “the most beau- 
tiful architectural work in the world” (2). 
Although dolomite is abundant in ancient sedimentary
rock, its rarity in modern environments has 
puzzled geologists for more than a century. 
Indeed, tackling this mystery in laboratories 
has proven formidable, hindering the study 
of this mineral—the so-called “dolomite 
problem.” On page 915 of this issue, Kim et 
al. (3) demonstrate that cycles of saturation 
conditions promote dolomite crystal growth 
in the laboratory. This discovery opens the 
door to investigating the geochemical pro- 
cess that influenced massive dolomite for- 
mation in the natural world. 

Dolomite is a mineral very similar to 
calcite, the calcium carbonate that forms 
the shells of many foraminifera, mollusks, 
and bivalves whose accumulation at the 
bottom of the seas created the well-known 
limestones. The difference between do- 
lomite and calcite is that the former con- 
tains equal parts calcium and magnesium. 
More notably, the calcium and magnesium 
atoms must be arranged in the crystalline 
structure of dolomite in a specific order, 
something that Le Corbusier, the canonic
crystal architect (4), would have been fascinated
by. This crystalline order consists
of layers of calcium and magnesium sepa- 
rated by layers of carbonate anions. 

Highly ordered dolomite is the most 
abundant component of most large car- 
bonate deposits (platforms) that accumu- 
late at or near sea level. However, although 
today’s seas are still as supersaturated in 
calcium-magnesium carbonate as those 
in the past, dolomite does not precipitate 
from them at ambient temperature; rather, 
the precipitation of dolomite is restricted 
to some hypersaline lagoons, helped by mi- 
croorganisms (5). Furthermore, dolomite 
is reluctant to crystallize at ambient tem- 
perature in the laboratory. How then were 
these enormous rock massifs such as the 
alpine Dolomites formed? 

One recent idea is that the ordered ar- 
rangement of calcium and magnesium 
atoms in the dolomite structure does not 
form during precipitation. Rather, cal- 
cium-magnesium carbonate precipitates 
in a disordered manner, and then order 
emerges from a slow maturation process 
on a geological timescale (6, 7). The surface 
of the initial disordered dolomite bears lo- 
cal ordered regions that are more stable. 
Disordered regions dissolve faster, and 
reprecipitation from the supersaturated 
solution forms another disordered surface 
that will itself bear ordered regions. The 
process moves the entire surface slowly to- 
ward order, on which a new layer of dolo- 
mite can assemble (see the figure). Crystal- 
lographic studies of geological formations 
of calcium-magnesium carbonate suggest 
that this ordering can last about 40 mil- 
lion years (8). 

When a mineral does not crystallize, it 
may be due to either a failure in nucle- 
ation—that is, the birth of an ordered nu- 
cleus (or seed) from constituents in solu- 
tion—or an inhibition of the growth of the 
seed. Kim et al. show that the problem of 
dolomite formation in modern seawater is 
not thermodynamically limited by nucleation.
The authors used density functional
theory (electronic structure calculations 
used to compute a variety of properties) 
and kinetic Monte Carlo crystal growth 
simulations to prove that the growth of 
ordered dolomite becomes self-inhibited 
beyond a few layers, after formation of the 
nucleating seed. Then how is the ordered 
crystalline structure created? Simulations 
also showed that disordered regions dis- 
solve faster in the surrounding solution 
than ordered ones. 

Dolomite growth, layer by layer

A maturation process is favored by fluctuations in the
fluid chemistry, allowing episodic dissolution of the
disordered regions and growth of the ordered ones.

A precipitate of disordered calcium-magnesium
carbonate (a proto-dolomite) will
reorganize into ordered dolomite by ripening.
Ripening, or maturation, is a very slow
process that can last millions of years at
constant ambient temperature, but fluctua- 
tions in supersaturation values enhance the 
kinetics of this process (9). Using geochemi- 
cal values of the relevant parameters (such 
as the amplitude of fluctuations and nucle- 
ation activation barriers), the simulations of 
Kim et al. also show that fluctuations in saturation
values (leading to dissolution and
growth) reduce dolomite formation time by 
seven orders of magnitude when compared 
with that from constant supersaturation. 
This dissolution and growth mechanism 
may explain why modern dolomite forms 
primarily in natural environments with pH
or salinity fluctuations. In other words, the
dolomite crystal growth process requires 
episodes of dissolution. 

One challenge to experimentally validating
the results of the computational
studies was the low ordering
rate. Kim et al. addressed the problem by
using a transmission electron microscope 
and a tiny fluid cell in which they placed 
a 3-µm-sized seed of crystalline dolomite
immersed into a saturated solution of 
calcium-magnesium carbonate. To mimic 
fluctuations of the saturation values at am- 
bient temperature, the authors provoked 
the radiolysis of water molecules with an 
electron beam to lower the pH and cause 
dissolution of the crystalline seed. Subse- 
quently switching off the beam and return- 
ing conditions to a higher pH triggered 
growth of the seed. After 3840 fluctuations 
over 128 minutes, Kim et al. determined 
that the seed grew about 200 nm. Electron 
diffraction revealed that the growth was 
ordered dolomite. Although the goal of the 
experiments was to mimic coastal regions, 
the authors had to increase the tempera- 
ture to 80°C to accelerate the dissolution 
and growth process of dolomite. 
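
The figures quoted above can be turned into per-cycle averages with a little arithmetic. The sketch below is only a back-of-the-envelope check: it assumes that every fluctuation had the same length and contributed equally to the net growth, which the text does not state, and the function name is purely illustrative.

```python
def fluctuation_stats(n_cycles, duration_min, growth_nm):
    """Average cycle length and growth implied by the reported totals.

    Assumes all cycles were equally long and contributed equally,
    a simplification that is not stated in the text.
    """
    seconds_per_cycle = duration_min * 60 / n_cycles
    growth_per_cycle_nm = growth_nm / n_cycles
    growth_rate_nm_per_min = growth_nm / duration_min
    return seconds_per_cycle, growth_per_cycle_nm, growth_rate_nm_per_min


# Figures quoted in the text: 3840 fluctuations, 128 minutes, about 200 nm.
sec, per_cycle, per_min = fluctuation_stats(3840, 128, 200)
print(f"{sec:.1f} s per fluctuation")
print(f"{per_cycle:.3f} nm of net growth per fluctuation")
print(f"{per_min:.2f} nm of net growth per minute")
```

Under these assumptions, each dissolution and growth cycle lasted about 2 seconds and added, on average, roughly 0.05 nm of ordered dolomite, or about 1.6 nm per minute.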

One challenge to growing larger crystals 
for analysis is that the experiment of Kim et 
al. could not be run longer. This is because 
undetectable evaporation of the solution in 
the tiny fluid cell would invalidate the ex- 
periment. Ideally, one could use a larger, 
airtight cell to grow larger dolomite re- 
gions. However, much more time would be 
needed to detect the overgrown crystals (for
example, almost 90 days would be needed to
grow a seed of 100 µm that could be better
analyzed with x-ray diffraction).

Certainly, singular events are relevant in 
geology, such as meteorite impacts or catastrophic
floods. However, slow-kinetics
phenomena primarily shaped the geologi- 
cal history of Earth. The findings of Kim 
et al. raise many questions about how geo- 
chemical fluctuations occur in the natural 
world over geological timescales and what 
factors influence the process.

PLANT SCIENCE


Air spaces bend light in plant stems 


Intercellular air spaces are necessary for phototropism in Arabidopsis thaliana 


By Christopher Whitewoods 


In the human eye, the lens and cornea
focus light onto the retina, where photosensitive
pigments absorb it and allow
the brain to form an image. This combination
of molecular sensing and physical
bending of light allows humans to
respond more accurately to their surround- 
ings. Plants also sense and respond to light 
through photoreceptors—for example, by us- 
ing directional growth to bend toward strong 
light (phototropism). Decades of work has 
identified the molecular mechanisms that 
underlie phototropism (1, 2), but whether
plants also physically alter beams of light 
to enhance their ability to respond was not 
known. On page 935 of this issue, Nawkar et 
al. (3) report that plants can use intercellular 
air spaces to actively modify the path of light 
and perform accurate phototropism. This 
finding highlights the importance of physi- 
cal organ structure in environmental sensing 
and opens new avenues to understand the 
role of air spaces in other contexts. 

Nawkar et al. show that intercellular air 
spaces in the hypocotyl (embryonic stem) of 
the model plant Arabidopsis thaliana scatter 
light and are necessary for phototropism. To 
do this, they identified a mutant line of the 
plant with a transparent hypocotyl that does 
not bend toward light. In this mutant, imag- 
ing methods including transmission electron 
microscopy revealed that the intercellular 
spaces in the hypocotyl and roots that are 
normally filled with air are instead full of wa- 
ter (see the images). These water-filled spaces 
allowed light to pass through the stem without
scattering, making the stem appear transparent.
Mutant plants responded normally
to gravity, which shows that the defect is
specific to light rather than a more
general defect in environmental sensing or 
directional growth. 

To confirm that intercellular air spaces 
are necessary for phototropism, Nawkar et 
al. used vacuum infiltration to fill intercel- 
lular air spaces in wild-type plants with wa- 
ter. Again, seedlings with water-filled inter- 
cellular spaces were unable to bend toward 
light but performed gravitropism normally. 
The authors identified the causative locus 
in the hypocotyl air space mutant as ATP-BINDING
CASSETTE G 5 (ABCG5). ABCG5 is
a member of a large family of transporters 
that act on a wide range of substrates (4), al- 
though the substrate for ABCG5 is unknown. 
Nawkar et al. used transmission electron mi- 
croscopy to show that in the abcg5 mutant, 
the cell wall bordering intercellular spaces 
lacks an electron-dense layer. Thus, ABCG5 
might transport a cell wall component, such 
as cutin, suberin, or lignin, to waterproof 
the intercellular spaces. These components 
have all been proposed to form part of a 
hydrophobic cell wall layer that surrounds 
intercellular spaces (5, 6). 

Images: wild-type and abcg5 mutant hypocotyls. In ATP-binding cassette G 5
(abcg5) mutants, the intercellular spaces are filled with water.

Previous work has identified the abcg5 mutant
(7), has shown that hypocotyls contain
intercellular air spaces (8), and has shown
that water infiltration enhances light transmission
in several plant species (9). Nawkar
et al. have linked these three observations to 
identify the molecular basis of tissue-level 
optical properties and to show that these 
properties confer the ability to respond to en- 
vironmental cues. 

Are air spaces important for light sensing 
in other organs, such as leaves and flowers? 
It is possible that light scattering aids differ- 
entiation between cell types on the upper and 
lower sides of the leaf. Attenuation of light 
through floral stems may promote sensing of 
directional light and allow heliotropic flow- 
ers, such as sunflowers, to follow the position 
of the Sun as it moves overhead. Targeted ge- 
netic ablation of other ABCG family members 
to generate plants with water-filled intercel- 
lular spaces in other tissues should begin to 
address these questions. 

The identification of ABCG5 as a regulator
of air space formation is also an avenue to 
understand the broader role of air spaces in 
other tissues and species. If other ABCG family
members control air space formation in
organs, such as leaves or flowers, then plants
with mutations in these genes would allow 
the function of air spaces throughout the 
plant to be investigated. Questions include 
whether air spaces affect light scattering in 
petals to make them more attractive to pol- 
linators and whether there are cell type-spe- 
cific roles for air spaces in the leaf. It would 
also be interesting to establish whether 
ABCG proteins evolved to maintain the en- 
larged air spaces that allow many aquatic 
plants to float. 

Such a clear function for air spaces in 
the stem may pose an answer to the ques- 
tions of how and why plants evolved to have 
these spaces. It has long been appreciated 
that air spaces function in leaves to promote 
gas exchange and light scattering (10), but
intercellular spaces can be observed in the
stems of 400-million-year-old fossil plants in
early Devonian rocks of the Rhynie Chert in
Scotland, which is long before plants evolved
to have leaves (11). The function of these stem
intercellular spaces is not clear, but it may be
that, as in the A. thaliana hypocotyl, they are 
air-filled channels that allow effective pho- 
totropism. This interpretation suggests that 
plants evolved to have air spaces in stems, 
at least in part, to regulate light sensing and 
that these spaces were co-opted to perform 
secondary functions in leaves. 

Overall, the study by Nawkar et al. demonstrates
the importance of looking in unexpected
places to answer long-standing
questions and is a reminder to look beyond
molecular mechanisms and consider the 
physical structure of an organism when 
thinking about function. It is also a re- 
minder that, like animals, plants actively 
modify their sensory inputs to better respond
to their environment.

M. S. Swaminathan (1925-2023)


Architect of India’s green revolution 


By Sameen Ahmed Khan and
Sanjay Vasant Deshmukh


Monkombu Sambasivan Swaminathan,
the plant geneticist whose work 
catalyzed India’s agricultural renais- 
sance, died on 28 September. He was 
98. Swaminathan worked tirelessly 
to revolutionize agriculture and en- 
sure food security and sustainable resource 
management. He aimed to improve crop 
yields, promote ecological and economic sus- 
tainability, and empower small farmers while 
integrating cutting-edge technology and 
promoting gender equality in agriculture. 
Swaminathan’s relentless research and advo- 
cacy enabled an enduring green revolution. 

Born in Kumbakonam, Tamil Nadu, India, 
on 7 August 1925, Swaminathan received 
a BSc in zoology from the University of 
Travancore (now the University of Kerala) in 
Kerala, India, in 1944, and a BSc in agriculture 
from the University of Madras in Chennai, 
India, in 1947. After brief fellowships in New 
Delhi and Wageningen, Netherlands, he re- 
ceived a PhD from the School of Agriculture 
at the University of Cambridge in 1952. 
Swaminathan studied cytogenetics and po- 
tato breeding at the University of Wisconsin- 
Madison for a year. He then returned to India 
and joined the indica-japonica rice hybridiza- 
tion program at the Central Rice Research 
Institute in Cuttack. Later in his career, and 
after his retirement in 1988, he held numer- 
ous leadership, consultant, and government 
positions in India and abroad. 

In 1950, Swaminathan began working to- 
ward his vision of food security by standard- 
izing techniques that allowed the breeding 
of previously infertile hybrid plants, thereby 
setting the stage for sustainable crops. Soon, 
he had created a potato hybrid that included 
alien genes conferring resistance to frost. 
Next, he pioneered efforts to improve the 
yields of fragile indica rice varieties by cross- 
ing them with hardier japonica varieties. 

In 1958, Swaminathan offered insights 
into how to induce mutations in wheat and 
rice, expediting the development of desired 
traits. This work led to a better understanding
of the effects of food irradiation, a process
that uses ionizing radiation to enhance food
safety without altering yield. Swaminathan 
also unraveled the genetic relationships 
among wheat species, and in 1963, he initi- 
ated a breeding program that incorporated 
dwarfing genes into wheat, producing 
shorter, stronger plants that boosted yields. 
A rice breeding initiative followed, in which 
Swaminathan created basmati strains that 
stood tall without breaking, even when bear- 
ing heavy grains. The release of Pusa Basmati 
1121, a rice hybrid with this trait, ensured high 
yield and quality in basmati rice production, 
a revolution for food security and farmers. 


The green revolution in the late 1960s, 
made possible by Swaminathan’s work, 
transformed agriculture by introducing high- 
yield crop varieties and modern techniques. 
These advances in breeding included “crop 
cafeterias,” in which diverse crop varieties 
were grown together, offering a balanced diet 
and improved nutrition, reducing the risk of 
dietary deficiencies. Instead of using fixed 
crop schedules, farmers could use crop dis- 
tribution agronomy, an approach that allows 
midseason adjustments in crop selection and 
planting, to optimize yield and food quality. 

Swaminathan’s creative approach extended 
into the digital era. In 1997, he established 
the first computer-aided rural knowledge 
centers, which promoted agricultural 
innovation and knowledge dissemination. He 
understood that digital technology had the 
power to advance rural prosperity. 

Swaminathan chaired the UN’s Advisory 
Committee (now Commission) on Science 
and Technology for Development from 1981 
to 1984, served as director of the International 
Rice Research Institute from 1982 to 1988, 
and presided over the International Union 
for Conservation of Nature from 1984 to 1990. 
He held advisory roles in the Indian govern- 
ment, led the Indian Council of Agricultural 
Research from 1972 to 1979, and served as 
a member of the Indian Parliament’s upper 
house, the Rajya Sabha, from 2007 to 2013. 
Awards recognizing Swaminathan’s 
work include 85 honorary doctorates, the 
Mendel Memorial Medal in 1965, the Ramon 
Magsaysay Award in 1971, and the prestigious 
Padma Shri in 1967, Padma Bhushan in 1972, 
and Padma Vibhushan in 1989. In 1986, he 
received the Albert Einstein World Award of 
Science, and in 1987, he was the inaugural 
laureate of the World Food Prize, often con- 
sidered to be an agricultural Nobel Prize. 

Author S.V.D. first met Swaminathan in 
1988 and assisted him in establishing the 
M. S. Swaminathan Research Foundation 
(MSSRF) in Chennai, India. The foundation’s 
mission is to harness science and technol- 
ogy for sustainable agricultural and rural 
development. They collaboratively devised 
an innovative global “hotspot” strategy for 
preserving the genetic resources of mangrove 
forests worldwide, garnering more than $500 
million in funding over a span of three de- 
cades. Between 1991 and 1997, author S.A.K. 
frequently met with Swaminathan as a PhD 
scholar from the Institute of Mathematical 
Sciences, an institution neighboring MSSRF. 

Swaminathan’s wisdom transcended aca- 
demia. He understood the intricate interplay 
between science and society. His commit- 
ment to translating knowledge into tangible 
solutions was exemplified by his initiation of 
community-based gene, seed, and grain man- 
agement strategies in 1998, which involved 
local participation and sustainable practices 
to ensure food security. He also engaged with 
critics and the movement protesting geneti- 
cally modified organisms by advocating for a 
balanced approach to biotechnology and em- 
phasizing safety and ethical considerations. 
Seamlessly blending vision and pragmatism, 
Swaminathan created disaster management 
strategies and developed a scientific mon- 
soon management plan in 1979, one of many 
examples of his proactive approach to ad- 
dressing pressing challenges. 

At the core of Swaminathan’s character 
was a profound sense of compassion. He 
championed technology development strat- 
egies that embodied empathy and an un- 
wavering commitment to comprehensive 
progress, particularly for impoverished and 
food-insecure people, especially women. He 
recognized that genuine advancement en- 
compassed not only science and technology 
but also the well-being of all, marginalized 
communities included. 
Mining stakes claim on salmon 
futures as glaciers retreat 


Future ecological value of emerging habitats must be 
considered as climate change transforms the planet 


By Jonathan W. Moore, Kara J. Pitman, 
Diane Whited, Naxginkw Tara Marsden, 
Erin K. Sexton, Christopher J. Sergeant, 
Mark Connor 


As climate change warms Earth, the 
melting cryosphere creates nascent 
ecosystems that have future value as 
habitat but that are also the frontlines 
for resource extraction (1). For 
example, glacier retreat uncovers 
rivers and valleys that go through rapid 
ecological succession to provide new habi- 
tats for important species, such as moose 
and Pacific salmon (2-5). However, mining 
companies are looking to retreating glaciers 
for newly exposed mineral deposits (6). 
This proglacial mining is a global pressure, 
from Greenland to Kyrgyzstan to western 
Canada (6). Yet environmental and mining 
policies might fail to consider the future 
ecological value and capacity of emerging 
habitats. We illustrate these issues below 
by exploring the overlap of glacial retreat, 
Pacific salmon future habitats, and mining 
pressures in western Canada and southern 
Alaska. Stewardship of glacierized landscapes, 
and other ecosystems that are being 
transformed by climate change, urgently 
needs forward-looking science and 
environmental policy. 

Migratory Pacific salmon support econo- 
mies, cultures, and ecosystems and are ex- 
panding into and populating rivers in west- 
ern North America as glaciers retreat (2-4). 
Sixty to 100% of glaciers are predicted to 
disappear from western Canada by 2100 (7). 
Although glacier retreat will decrease water 
storage and cooling capacity, which poses 
downstream risks to people and aquatic 
ecosystems (8), linked models of climate 
change, glacial retreat, and salmon habitat 
forecast the creation of thousands of kilo- 
meters of new salmon rivers over the com- 
ing decades in western North America (4), 
a potential partial offset for losses in other 
salmon populations due to climate warm- 
ing and other stressors. If these emerging 
river systems are protected, they can pro- 
vide future habitats for important aquatic 
species such as salmon and also early-suc- 
cession riparian habitats and wetlands that 
support moose and other wildlife (5). 

The future capacity of these emerging 
habitats could be profoundly altered by 
industrial mining. Although mining can 
provide materials critical to humanity, 
including those to support low-carbon 
technologies, mining can destroy habitat, alter 
hydrology, and contaminate soils and 
water (9). Mining companies may also remove 
glacier ice with machinery or explosives to 
access mineral deposits or protect 
infrastructure (6). 

The Taku River of northern BC, Canada, and southern 
Alaska, USA, contains high mineral potential and 
retreating glaciers, opening up habitats for salmon. 

Although prior research has documented 
where and when future salmon habitats will 
be created with glacier retreat (4), studies 
have not yet assessed where and how these 
future habitats intersect with mining pres- 
sures. Here we (i) identify the geographic 
overlap of glacier retreat, salmon habitat 
gains, and potential mining pressure; and 
(ii) examine the policy barriers and op- 
portunities for linking this information 
to action. These analyses provide specific 
geospatial information to inform proactive 
land-use planning and conservation and 
reveal critical policy gaps and opportuni- 
ties. We focus on the transboundary region 
of North America where rivers and poten- 
tial mining impacts cross the boundaries 
of northern British Columbia (BC), Canada, 
and southern Alaska (AK), USA. This 
96,525-km², heavily glacierized region has 
substantial projected salmon habitat creation 
(4) and contains the “Golden Triangle,” 
a mining hotspot of mineral-rich geology in 
the western Stikine terrane. Most mining in 
this region appears to be targeting gold, 
even though only 8% of global gold is used for 
societally important technology (10). 


POLICY CONTEXT 
This transboundary region predominantly 
occurs in BC, Canada. A critical BC policy 
is the Mineral Tenure Act, whereby mining 
is currently a free-entry process and claims 
can be staked through an online portal by 
companies or individuals for a nominal fee 
without consultation. Mining claims grant 
the right for exploration, which can have 
its own environmental impacts (9). Mine 
claims also give companies the right for 
future mineral development; even if claims 
are speculative or stagnant, they thus pose 
barriers for forward-looking planning and 
conservation. Staking is generally allowed 
on all types of land, unless explicitly forbid- 
den such as in protected areas or No Regis- 
tration Reserves. Indeed, mining companies 
can currently stake claims on the unceded 
territories of First Nations and preemp- 
tively on glaciers before land is exposed. 
Prior to development, mines may go 
through a provincial or federal environ- 
mental assessment and are subject to other 
environmental policies that could regulate 
potential impacts. However, neither BC nor 
Canadian environmental assessment laws 
mandate incorporation of climate change 
forecasts and future habitat values into 
evaluation of environmental risks. Further, 
once a project is deemed “substantially 
started” by BC, the environmental assess- 
ment certificate that grants the rights to 
mine development can be held in perpetu- 
ity. Thus, current policies do not regulate 
potential mining impacts on the future 
habitat values of locations that are being 
transformed by rapid climate change. 

These colonial policies are being applied 
to Indigenous lands and waters—almost 
all of BC is on lands whose rights and title 
have never been ceded by First Nations. 
The BC Declaration on the Rights of 
Indigenous Peoples Act and the Canadian 
Constitution recognize the inherent rights 
of Indigenous Peoples to steward their 
territories, but these recognitions have yet 
to be incorporated into many colonial en- 
vironmental policies such as the Mineral 
Tenure Act. With the increased recogni- 
tion of Indigenous laws, the lands and re- 
sources in this region are governed by legal 
pluralism. 


FUTURE SALMON HABITAT 
AND MINING PRESSURE 
We determined where there is overlap 
between future salmon habitat, mining 
claims, and mineral potential (4) (see 
supplementary methods). Hundreds of 
kilometers of future salmon habitat have 
been staked by mining companies for min- 
eral exploration (see the figure). Across 
the 114 subwatersheds forecasted to have 
new salmon habitat, 25 had more than 
50% of future habitat within 5 km of min- 
ing claims, and 17 had more than 90%. 
The overlap of mining claims and future 
salmon habitat varied immensely across 
the eight focal watershed regions. For ex- 
ample, 99% of 114 km of future salmon 
habitat were within 5 km of claims in the 
Nass and 62% of 279 km in the Taku, but 
12% of 472 km in Central Southeast Alaska 
(CSE AK) and 10% of 2011 km in the Alsek 
(see the figure and table S1). There was also 
high variation within subwatersheds (see 
the figure). For example, for the 14 subwa- 
tersheds within the Stikine, from 0 to 100% 
of future salmon habitat was within 5 km 
of mining claims (table S2). Across all 
watershed regions, 564 km (11%) of the 4973 km 
of future salmon habitat were within 5 km of 
mining claims, and 286 km (6%) had claims 
directly on them (table S1). Thus, mining 
companies have already staked claims 
over substantial future salmon habitats. 
The majority of future salmon habitats 
also have considerable mineral potential, 
an additional index of future mining 
pressure. Of the future salmon habitat that 
has been assessed for mineral potential 
(Canada only, 2303 km), 53% was directly 
assessed as either high (634 km) or medium 
(570 km) mineral potential (tables S1 and S2). 
Mineral potential varied across watershed 
regions and was particularly high in the 
Stikine (82% of future salmon habitat directly 
assessed as high or medium mineral potential) 
and Taku (94%). 

Future salmon habitat and mining claims 

For each of eight focal regions (denoted by thin black lines), shown are the 
projected future salmon habitats by subwatershed, the river lengths of 
those habitats under complete glacier retreat, and the proportion of the future 
habitat that has a mining claim within 5 km. 
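
The overall overlap shares quoted in the text (11% of future salmon habitat within 5 km of mining claims, 6% with claims directly on it) follow from simple arithmetic on the kilometer totals. The short Python sketch below is only an illustrative check and is not part of the authors' analysis; the constant and function names are hypothetical, and the inputs are the three kilometer figures reported in the text.

```python
# Kilometer totals as quoted in the text (all eight watershed regions).
TOTAL_FUTURE_HABITAT_KM = 4973        # projected future salmon habitat
WITHIN_5_KM_OF_CLAIMS_KM = 564        # habitat within 5 km of a mining claim
CLAIMS_DIRECTLY_ON_HABITAT_KM = 286   # habitat with a claim directly on it


def percent_share(part_km: float, total_km: float) -> int:
    """Return part/total as a percentage rounded to the nearest whole number."""
    if total_km <= 0:
        raise ValueError("total_km must be positive")
    return round(100 * part_km / total_km)


near_claims_pct = percent_share(WITHIN_5_KM_OF_CLAIMS_KM, TOTAL_FUTURE_HABITAT_KM)
on_claims_pct = percent_share(CLAIMS_DIRECTLY_ON_HABITAT_KM, TOTAL_FUTURE_HABITAT_KM)

print(f"Within 5 km of a claim: {near_claims_pct}%")      # 11%
print(f"Claims directly on habitat: {on_claims_pct}%")    # 6%
```

The same helper could be applied to any subwatershed row, given its own kilometer totals from the supplementary tables.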

These analyses focus on projected 
salmon habitat quantity with glacier re- 
treat, and the timing of its availability 
will vary with emissions scenarios (4). 
The quality of salmon habitat is also likely 
evolving rapidly. As glaciers retreat, down- 
stream floodplain succession and channel 
stabilization may further improve salmon 
habitat over decades (2, 3), a potentially 
large-scale change that is a key research 
priority. Complete loss of glaciers may 
also increase vulnerability of downstream 
aquatic ecosystems to droughts and heat 
waves (3, 8). 

Here we reveal the degree to which mining 
claims and mineral potential overlap 
with emerging salmon ecosystems, but 
how much of this potential risk will trans- 
late to direct impacts? Some claims are ex- 
ploratory or speculative and may never be 
developed. However, the study area is un- 
dergoing a gold rush with many new explo- 
rations and major mines and substantial 
mineral potential, indicating that develop- 
ment of claims into major mines is quite 
possible (9). The region is remote, and 
access could limit mining development, 
but there have been major investments in 
northern BC for thousands of kilometers 
of transmission lines to enable mining de- 
velopment (9). Mining companies are also 
increasingly under pressure from their 
investors to improve their environmental, 
social, and governance responsibility (11); 
however, mining has a long history of pro- 
foundly degrading and contaminating both 
terrestrial and aquatic habitats (9) and has 
been criticized for continuing to underes- 
timate environmental risks and failing to 
effectively mitigate damages (9, 12). The 
scale and type of mining will also deter- 
mine environmental impacts that could be 
either smaller or cumulatively larger than 
the 5-km buffer used in this analysis (9). 
Glacial retreat can also increase hazards 
such as glacial lake outburst floods or land- 
slides that compound environmental risks 
posed by mines (6). Thus, the actual im- 
pacts of mining on future salmon habitats 
will be determined by the efficacy of current 
and future policies. 


POLICY OPTIONS FOR PROTECTING 
CLIMATE FUTURES 

Our analyses delineate the locations of over- 
lap between mining claims and future values 
for salmon and other species. Notably, these 
habitats that are emerging from ice may be 
considered of negligible current value to 
salmon and other species during risk assess- 
ments and thus omitted from protections of- 
fered by environmental laws such as BC and 
Canadian environmental assessment laws 
and the Fisheries Act. These policies focus on 
risks for activities to harm the current eco- 
systems, but do not mandate consideration 
of risks to future habitat values. Given that 
mining impacts can persist for decades to 
centuries or more (9), our analyses identified 
the key subwatersheds where environmen- 
tal risk evaluations should consider poten- 
tial harms to future salmon habitats. More 
broadly, there is an urgent need to mandate 
that risk assessment policies incorporate the 
best available scientific understanding of 
forthcoming climate change transformations 
to balance the protection of future environ- 
mental values and benefits with mining and 
other industrial pressures. 

This study also identified many subwater- 
sheds that contain unstaked future salmon 
habitat (table S2), representing opportuni- 
ties for targeted protection of future salmon 
habitat through land-use plans or protec- 
tions before stakes are claimed. Once claims 
have been staked, land protection is chal- 
lenging given the current mining legislation; 
governments can potentially buy out claims 
for large sums of money only if mining com- 
panies are willing. Across the vast and re- 
mote study region, we identified locations of 
extensive areas that could be protected with 
land-use designations for salmon futures 
before they are staked. There is precedent for 
such targeted protections: habitats emerging 
from the retreating Mendenhall Glacier, a 
tourist destination near Juneau, AK, recently 
received protection from mining develop- 
ment by US federal agencies. Policy options 
in Canada include No Registration Reserves 
under the Mineral Tenure Act, Section 17 
designations under the Lands Act, protec- 
tions under the Park Act, and Ecologically 
Significant Areas under the Fisheries Act. 

Indigenous Protected Areas (IPAs) are an 
important policy option for forward-looking 
conservation in the current context of legal 
pluralism, including where colonial govern- 
ments are beholden to mineral interests 
which have been staked without Indigenous 
consultation or consent. Indigenous groups 
are witnessing rapid climate change and are 
declaring IPAs for proactive protection of fu- 
ture salmon habitats as changes to habitat 
quantity and quality play out in real time. 
For example, in 2021, Gitanyow Hereditary 
Chiefs declared that the Wilp Wii Litsxw 
Meziadin IPA in the Nass was off-limits for 
mining because of recent observations of 
increases in salmon associated with glacier 
retreat. Although two historically impor- 
tant salmon creeks were already protected 
thanks to the Gitanyow Lax’yip Land Use 
Plan in 2012, recent data revealed that sub- 
stantial sockeye were actually spawning 
in Strohn Creek owing to glacier retreat, 
which historically had not been a substantial 
spawning habitat. In addition, in 2023, the 
Taku River Tlingit declared the T’akú Tlatsini 
IPA, an extension of previous protections to 
include glaciers and future salmon habitat. 
It remains to be seen to what degree the BC 
government will support or impede these 
forward-looking protections. 

An alternative proactive policy option 
would be for the US and Canada to pro- 
vide broad legislative protection of glaciers 
and the habitats that arise from them, in 
accordance with other countries such as 
Argentina (13). 

Our findings also highlight the urgent 
need to reform the Mineral Tenure Act of 
BC (14). Gitxaała First Nation and others 
have challenged the Mineral Tenure Act in 
court on the basis that it violates their fun- 
damental and constitutional Indigenous 
rights to steward their own waters, lands, 
and resources and called for the reform of 
free-entry claim staking that enables min- 
eral exploration without consultation. Other 
First Nations are also working with mining 
industries and the BC government in new 
consent-based decision-making processes 
that advance Indigenous rights and envi- 
ronmental sustainability. Overhaul of the 
Mineral Tenure Act and incorporating min- 
eral claim-staking into broader government- 
to-government land-use planning efforts 
would help to advance Indigenous rights and 
enable forward-looking and balanced land-use 
planning rather than having landscape 
trajectories be driven by market values and 
mining companies. 


CONCLUSION 

The nexus of glaciers, salmon, Indigenous 
rights, and mining is a globally relevant 
example of the urgency of forward-looking 
science and policy for climate resilience 
and environmental justice. Here we pro- 
vide the spatial information to inform 
proactive stewardship of climate futures 
for Pacific salmon even as they struggle in 
much of their range with climate change 
and other stressors, and we identify policy 
options and reforms to protect future habi- 
tat in glacierized watersheds. From glacier 
retreat to sea-ice retreat, rapid climate 
transformations are exacerbating industry 
pressures, and current policies are lagging 
behind the rapid pace of change. There is 
an urgent and widespread need to criti- 
cally evaluate and reform colonial policies 
that were built on a static and extractive 
view of ecosystems and are barriers for cli- 
mate adaptation. Concurrently, there is a 
recognized need for global action to reduce 
greenhouse gas emissions to slow the pace 
and magnitude of climate change trans- 
formation (7, 15). The proactive protection 
of climate futures demands policy reform 
to enable forward-looking environmental 
decision-making for resilience and adapta- 
tion (1). 
Role-playing climate resilience 


An optimistic game proves that solving hard problems 
can be serious fun 


By Valerie Thompson 


How can we meet current and future 
energy demands while reducing fossil 
fuel use? Should we focus on technological 
solutions or on fortifying 
community resilience? On local or 
global projects? These and other vital 
questions will inform discussions at the 
upcoming United Nations climate change 
conference (COP28) in Dubai, where world 
leaders will soon come together in pursuit 
of a common goal: to mitigate carbon emissions 
and stabilize rapidly warming global 
temperatures. They are also the fodder 
of Daybreak, a new cooperative climate- 
themed board game. 

Daybreak cocreator Matt Leacock has 
a knack for extracting satisfying game- 
play from notoriously tough topics. His 
2008 game Pandemic, in which players 
work together to combat fictional dis- 
ease outbreaks, was a favorite among 
board game enthusiasts long before it 
was embraced by a wider audience during 
COVID-19 lockdowns. 

Like Pandemic, Daybreak brings an op- 
timistic attitude to its subject matter, fo- 
cusing on cooperative action that can be 
taken to tackle a complex problem. Its title, 
for example—updated from the decidedly 
gloomier “Climate Crisis” used during 
early playtesting—is intended to evoke a 
hopeful vision of a clean energy future. 
And like Pandemic, it is seri- 
ously fun to play without trivi- 
alizing the difficult topic with 
which it engages. 

Daybreak 
Matt Leacock and Matteo Menapace 
CMYK, 2023. 

Players begin by assuming the roles of 
four world powers (Europe, the United 
States, China, and the Majority World), 
each of which has different energy demand 
and sources, emissions, resilience, and 
vulnerable populations (“communities in 
crisis”). Through a combination of local 
and global projects, players 
work together to reduce emissions, replace 
dirty energy sources with clean ones, and 
build resilience in order to survive various 
planetary crises. As unsequestered carbon 
accumulates, a temperature tracker ticks 
upward, amplifying planetwide effects and 
global crises. 

To win, players must achieve “drawdown,” 
which is accomplished when the 
group’s collective carbon can be fully 
sequestered by trees, oceans, and direct air 
capture. If the temperature increases by 
2°C, if any player ever has 12 or more 
communities in crisis, or if the group has not 
achieved drawdown by the end of round 6, 
the game is lost. 

A satisfying “engine-building” mechanic 
(a feature, found in many popular board 
games, in which card effects are enabled 
or amplified by others) underscores the 
intersecting and complementary nature of 
various climate interventions. The “walkable 
cities” card, for example, allows a player to 
remove one transportation emission token 
per round; however, it can only be played 
in combination with two other regulatory 
cards. Similarly, the “early warning 
systems” card allows players to preview 
upcoming crises and fortify community 
resilience accordingly, and this action can 
be repeated and becomes more powerful 
when other innovation cards are present. 

Daybreak rightly focuses on societal-level 
climate mitigation strategies, emphasizing 
collective actions, such as rewilding, 
the implementation of universal public 
transportation, and increasing the efficacy 
of ocean shipping, rather than individual 
responsibilities, such as recycling or having 
fewer children. It is also expansive 
in its embrace of actions that strengthen 
planetary health, valuing practices such 
as inclusive immigration, the education of 
women and girls, and Indigenous stewardship. 
Meanwhile, many of the crises players 
face represent pointed indictments of fossil 
fuel companies, referencing the industry’s 
lobbying efforts, its negligence, 
and its disinformation tactics. 
QR codes on each card helpfully 
link to short descriptions of the 
philosophy and/or science that 
underlies each project, offering 
greater context for less-familiar 
strategies, such as green quantitative 
easing, as well as for more 
controversial approaches, such 
as stratospheric sulfur geoengineering 
and cloud brightening. 

As it is in real-life climate negotiations, 
the central tension of 
Daybreak lies in the trade-offs that must be 
made. Stopgap strategies will sometimes 
take precedence over long-term solutions. 
Global efforts may languish as populations 
prioritize local projects. Crises will com- 
pound and exacerbate one another, mak- 
ing the most meaningful next step difficult 
to discern. Ultimately, there is no one right 
path forward, and success is never prom- 
ised. One thing, however, is certain: We 
stand a chance if we work together. 


10.1126/science.adl4244 
A new philosophy of STEM work 


More can be done to recruit, train, and retain scientific 
professionals, argues a sociologist 


By Jonathan Wai and Maya Wai 


In Wasted Education, sociologist John 
Skrentny maintains that a large portion 
of US science, technology, engineering, 
and math (STEM) education is currently 
“wasted,” with few students who learn 
specialized skills in these fields using 
them after graduation and high turnover in 
sectors where such skills are used. Although 
we invest heavily in training future work- 
ers in these fields, we do little to ensure that 
graduates remain in the STEM workforce 
and productively and meaningfully use the 
skills they have learned, he argues. 

In the book’s introduction, Skrentny sum- 
marizes the core arguments that are most 
often made for producing more STEM work- 
ers—that there is a shortage of such citizens, 
that it will help to ensure the country’s com- 
petitiveness, and that it will allow the US to 
respond more adeptly to various global cri- 
ses. Yet more than half of STEM graduates do 
not work in STEM occupations, a trend that 
has been “consistent since early in the first 
decade of the twenty-first century.” So where 
do they go? Some leave for occupations 
with better pay, reveals Skrentny, but many 
employers also drive away STEM graduates 
through a “burn and churn” approach to the 
workforce. “It is a paradox that employers in 
the same sectors where complaints of worker 
shortages are the loudest are so willing to 
push them so hard that they leave in tears—if 
they are not fired first,” he observes. 

Keeping up to date with the skills that 
are required to succeed in STEM jobs can be 
extremely challenging and likely contributes 
to some workers’ decisions to pivot to other 
fields. Skrentny claims that many employ- 
ers do not do enough to address the training 
needs of their workers, instead opting to treat 
them like interchangeable widgets that are 
discarded when they become obsolete. 

Many top tech companies and STEM em- 
ployers are also to blame for their own lack 
of workforce diversity, argues Skrentny. He 
devotes chapter 6 to accounts of underrep- 
resented and minoritized STEM workers, 
describing the challenges they have faced 
in their careers. Many STEM employers, he 
notes, “engage in or tolerate practices that 
are driving women, underrepresented mi- 
norities, and older workers away while at the 
same time complaining that schools are fail- 
ing to give them enough STEM grads.” 

In the book’s final section, Skrentny 
contends that having the opportunity to 
contribute to morally meaningful work— 
for example, working on solving climate 
change—would go a long way to improving 
conditions for STEM workers and may even 
convince some to stay in the fields in which 
they trained. He concludes by arguing that 
government investment in such work could 
make a difference: “[M]ore government 
funding for research and development in ar- 
eas that can solve our manifold global crises 
could help not only solve the problems that 
desperately need solving but also provide a 
lot of rewarding and meaningful employ- 
ment opportunities for STEM grads.” 

A major take-home from this book is that 
the K-12 education sector often fails to pro- 
vide students with skills that employers want 
and need. Skrentny insists that employers 
should do more to remedy such gaps, but 
it could be argued that schools should also 
work harder to ensure that STEM curriculum 
aligns with the needs of future employers. 

One might also take issue with Skrentny’s 
definition of “wasting” education. For ex- 
ample, mathematically precocious students 
may end up in STEM disciplines, but many 
also go on to innovate in a wide array of other 
domains (1). Additionally, some scholars have 
reasoned that higher education is valuable 
in its own right, regardless of one’s specific 
training (2), and postsecondary training of 
any kind can signal to employers that a per- 
son is worth hiring (3). 

With the rise of artificial intelligence and 
other technologies, keeping one’s skills up to 
date is more difficult now than ever before. 
As Skrentny suggests, part of the solution to 
this problem likely lies with employers, who 
can do more to recruit, train, and retain a 
diverse array of individuals and to promote 
more-inclusive work cultures. At the same 
time, talented workers will always be in de- 
mand, and “burn and churn” will always be a 
part of employment when competition and a 
company’s bottom line determine its chance 
of survival. Although Skrentny does not have 
all the answers, he asks important questions 
that encourage readers to think more deeply 
about what a meaningful education is (and 
is for) and the nature of meaningful work. 
Is the Seated Woman of Çatalhöyük a fertility goddess? Perhaps not. 
The Patriarchs: 

The Origins of Inequality 
Angela Saini 

Beacon Press, 2023. 256 pp. 


Today's patriarchal systems may 
appear robust and inevitable, but 
history shows that they are neither 
permanent nor preordained. This week 
in a special bonus segment of this 
series, host Angela Saini discusses 
her own book, The Patriarchs, which 
dispels commonly invoked myths 
about gender inequality and reveals 
how gendered power structures are 
constantly being renegotiated and 
reasserted. bit.ly/3sdMqQS 
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Fossil fuels threaten 
northwest Africa 


The lower Senegal River Basin in 
Mauritania and Senegal is a unique net- 
work of water basins, floodplains, and 
sand dunes. The fossil fuel industry, 
including the company BP, established 
a presence in the region after major 
offshore gas fields were found (1). BP 
originally planned only an offshore liq- 
uefied natural gas (LNG) terminal, but 
the development of the much larger 
BirAllah gas field in Mauritania’s coastal 
basin prompted plans for an additional 
onshore LNG facility (2). Although any 
gas extraction poses environmental risks, 
the onshore facility directly threatens 
the region’s vast biodiversity and pro- 
tected areas. Despite the fact that nei- 
ther Mauritania nor Senegal sufficiently 
regulates the gas industry, the companies 
should live up to their promises to “take 
action to restore, maintain, and enhance 
nature” (3). 

The lower Senegal River Basin hosts 
exceptional biodiversity. The region 
endured harsh droughts in the 1970s and 
1980s (4), displacing rural communities to 
cities. To manage water supply for the food 
production that the growing urban popu- 
lations required, dams were constructed: 
the Diama Dam was completed in 1985 and the 
Manantali Dam in 1989 (5). The dams pre- 
vented saltwater from flowing upstream 
but had adverse effects on the environ- 
ment. Wetlands dried up and turned into 
salt deserts downstream, and aquatic 
weeds infested waters upstream (5). 

Despite the degradation caused by the 
dams’ construction and use, the Senegalese 
and Mauritanian governments managed 
to restore a mosaic of wetland habitats 
(5). Djoudj National Bird Sanctuary, a 
World Heritage site, was designated in 
1981; Diawling National Park was estab- 
lished in 1991; and Chott Boul Reserve 
was recognized around 2000—all are 
Ramsar sites (5, 6). The region received a 
United Nations Educational, Scientific and 
Cultural Organization (UNESCO) designa- 
tion of Man and Biosphere reserve in 2012 
(7), recognizing sustainable practices. 

The planned infrastructure—especially 
the onshore facility, the proposed location 
of which is within the Man and Biosphere 
reserve—will put this region’s hard-won 
environmental successes at risk. The fos- 
sil fuel industry should instead uphold its 
commitment to sustainable development 
as proposed in, for example, the International 
Finance Corporation Performance 
Standard 6 (8). Moreover, the authorities 
in both Mauritania and Senegal should 
hold the industry accountable, given that 
both countries have committed to the Food 
and Agriculture Organization’s (FAO’s) 
Canary Current Large Marine Ecosystem 
project (9), which aims to reverse further 
ecological degradation of the area and to 
safeguard natural resources and fisheries. 
The fossil fuel industry should avoid con- 
struction on the UNESCO Reserve and take 
action to enhance biodiversity conserva- 
tion, onshore and offshore. 
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FAIR data would alleviate 
large carnivore conflict 


The nature directives of the European 
Union have contributed to the recovery of 
large carnivore populations, which were 
decimated after centuries of persecution. 
However, in September, the European 
Commission claimed that wolves pose 
a danger to livestock and requested 
that anyone with any type of related 
data submit it as part of a review of the 
wolf’s conservation status (1). Requesting 
unvetted data instead of relying on the 
scientifically sound data on species 
conservation status regularly provided by 
each member state (2) upends established 
legal procedures. The EU must facilitate 
the collection of data on the livestock sec- 
tor and associated losses that adheres to 
FAIR guidelines: The data should be find- 
able, accessible, interoperable (integrated 
with other data), and reusable (3). With 
reliable data, the Commission can prop- 
erly assess the impact of large carnivores 
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and implement evidence-backed solutions. 

Losses due to large carnivore damage 
are not a threat to the livestock industry. 
Between 2012 and 2016, less than 0.06% of 
the over-wintering sheep stock on main- 
land Europe was lost annually as a result 
of predation (4). Besides economic com- 
pensation for predated extensive livestock, 
European farmers receive substantial 
subsidies, including specific funds for 
farming under difficult conditions (5). Yet 
economic losses from predation by large 
carnivores have been used to justify culling 
by hunting and lethal population management 
(1, 6), strategies that are ineffective at 
preventing livestock losses and can lead to 
increased damage (6, 7). 

Current EU legislation includes lethal 
control as a potential management 
tool under certain conditions, but the 
Commission seems to be inviting data that 
could justify more permissive guidelines. 
The consequences of disregarding scien- 
tific evidence and using unreliable data 
could be severe. Increased culling of large 
carnivores could hinder the connectivity 
needed to recover genetic variability of 
some isolated populations, compromising 
their long-term viability (8). In turn, the 
population reduction could compromise 
the role of these species in maintaining 
biodiversity and ecosystem functioning (9), 
including the control of disease dynamics 
of prey populations; the regulation of her- 
bivore densities, seed dispersal processes, 
landscape configuration, and stream mor- 
phology; and the fertilization of aquatic 
and terrestrial ecosystems [e.g., (10)]. Large 
carnivores also mitigate damage to silvicul- 
ture and agriculture caused by herbivores, 
reduce wildlife-vehicle collisions by chang- 
ing prey density and behavior, and inspire 
wildlife-based tourism and nature-based 
education [e.g., (11)]. 

There is an urgent need to promote 
coexistence between large carnivores 
and humans. Unlike culling, preventive 
measures, such as shepherds, guard- 
ing dogs, and enclosures, and effective 
compensation systems, such as making 
payments to farmers conditional on prevention, 
have been shown to reduce damage and 
economic losses (6, 12). Reliably assessing 
damage and mitigation strategies requires 
implementing a coordinated European 
database with quality-controlled FAIR 
livestock predation data. 
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Australia’s carbon plan 
disregards evidence 


Australia’s commitment to climate change 
abatement lies partly in the reduction of 
grazing to increase above-ground woody 
biomass (1, 2). However, this strategy is not 
supported by scientific evidence, which 
shows that increasing—not decreasing— 
grazing leads to more trees and shrubs. 
Australia should replace efforts to reduce 
grazing with effective methods of seques- 
tering carbon. 

The Human-Induced Regeneration 
Scheme, a frequently used program under 
the Australian Carbon Credit Unit Scheme 
(1), claims that reduced grazing pres- 
sure from livestock and/or feral animals 
will regenerate even-aged native forest 
that grows to taller than 2 m. However, 
reducing or removing grazing on arid and 
semiarid rangelands does not result in an 
increase in woody biomass or an increase 
in the size of trees and shrubs taller than 
2 m over the next 15 years (3-6). On the 
contrary, increasing grazing pressure 
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reduces grasses, which liberates water 
resources and enhances woody plants. 
This phenomenon, known as woody 
encroachment or woody thickening (3-6), 
leads to a dominance of woody plants 
at the expense of herbaceous species. 
Increased rainfall and carbon dioxide, 
along with reduced frequency and magni- 
tude of fires, further increase woody plant 
growth in rangelands (3). 

The Human-Induced Regeneration 
Scheme is also inconsistent with the current 
understanding of the location of carbon 
pools in arid ecosystems. Ecological theory 
and empirical evidence reveal that the bulk 
of ecosystem carbon in drylands is stored in 
the soil (7)—from 7 to 100 times more than 
in the vegetation (8). Therefore, a focus 
on above-ground carbon—i.e., increasing 
woody biomass—is unlikely to have a major 
impact on Australia’s carbon budget (9). 
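
The "7 to 100 times" range can be turned into shares of total ecosystem carbon with a short calculation. The sketch below assumes, purely for illustration, that soil and vegetation are the only two carbon pools; the function name `soil_share` is hypothetical and not taken from the cited studies.

```python
def soil_share(soil_to_vegetation_ratio: float) -> float:
    """Fraction of ecosystem carbon held in soil, assuming only two pools.

    If soil holds r times the carbon in vegetation, then
    soil / (soil + vegetation) = r / (r + 1).
    """
    if soil_to_vegetation_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return soil_to_vegetation_ratio / (soil_to_vegetation_ratio + 1.0)


# Endpoints of the range quoted in the letter (7x and 100x).
low_share = soil_share(7)     # soil fraction when soil holds 7x vegetation carbon
high_share = soil_share(100)  # soil fraction when soil holds 100x vegetation carbon

for ratio in (7, 100):
    share = soil_share(ratio)
    print(f"ratio {ratio:>3}: soil {share:.1%}, vegetation {1 - share:.1%}")
```

Under that two-pool assumption, vegetation accounts for at most about one-eighth of the carbon, which is why changes to above-ground biomass alone move the total so little.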

Instead of the Human-Induced 
Regeneration Scheme, Australia should 
implement a system that rewards pastoral- 
ists and landowners for restoring native 
vegetation on naturally degraded and 
previously cleared land that was originally 
dominated by either woody plants or 
grasses (10). Restoration practices should 
be tailored to specific regions and the type 
of degradation. For example, active resto- 
ration of logged tropical forests has been 
shown to result in greater carbon accumu- 
lation than naturally regenerating forest 
(11). As articulated in the recent session of 
the United Nations General Assembly (12), 
science must inform the strategies put in 
place to address environmental challenges. 
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Seeing the light 


As seedlings emerge, their embryonic stems unfold and extend as 
they grow toward light. Phototropin blue light receptors detect 
the gradient of light, thus allowing the direction of growth to be 
determined. Nawkar et al. found that mutation of a transporter 
protein, ABCG5, caused defective seedling phototropism (see 
the Perspective by Whitewoods). The authors found that air 


ASTROPARTICLE PHYSICS 
A highly energetic 
cosmic ray 


Cosmic rays are charged particles 
from space. At low energies, they 
mostly originate from the Sun, 
whereas at high energies, they are 
expected to be emitted by nearby 
active galaxies. The Telescope 
Array Collaboration now reports 
the detection of a cosmic ray 
event with an energy of about 
240 exa-electron volts, more 
than a million times higher than 
that achieved by artificial particle 
accelerators. Such high-energy 
particles should experience only 
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Air channels between cells help seedlings sense and stretch toward sunlight. 
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small deflections by foreground 
magnetic fields, but tracing 
back the arrival direction shows 
no obvious source galaxy. The 
authors suggest that the fore- 
ground magnetic fields might be 
stronger than expected, or there 
could be unknown particle phys- 
ics at high energies. —KTS 
Science, abo50985, this issue p. 903 
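
The "more than a million times" comparison can be checked with a quick unit conversion. The sketch below assumes a reference accelerator energy of 6.8 tera-electron volts per proton, roughly the beam energy of the Large Hadron Collider; that figure is an assumption introduced here and does not come from the summary above.

```python
EV_PER_EEV = 1e18            # electron volts in one exa-electron volt
EV_PER_TEV = 1e12            # electron volts in one tera-electron volt
JOULES_PER_EV = 1.602176634e-19

cosmic_ray_ev = 240 * EV_PER_EEV   # energy reported in the summary
accelerator_ev = 6.8 * EV_PER_TEV  # assumed per-proton accelerator energy

# How many times more energetic the cosmic ray is than one accelerated proton.
ratio = cosmic_ray_ev / accelerator_ev

# The same energy in everyday units.
cosmic_ray_joules = cosmic_ray_ev * JOULES_PER_EV

print(f"energy ratio: {ratio:.2e}")
print(f"cosmic ray energy: {cosmic_ray_joules:.1f} J")
```

With that assumed reference, the ratio comes out in the tens of millions, consistent with "more than a million times," and the single particle carries tens of joules.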


GLYCOSYLATION 
Cross-coupling-like 
reaction for glycans 


Glycosylations are an important 
feature of many natural products, 


spaces between hypocotyl cells were impaired, preventing scattering 
of light across the tissue. Intercellular air spaces have been known to aid 
gas exchange and facilitate buoyancy in aquatic plants. The work here 
demonstrates an additional functional role for air spaces in setting up a 
light gradient and also provides insight into how they form. —MRS 


Science, adh9384, this issue p. 935; see also adl2394, p. 885 


but methods to install sugars 

can suffer from limitations on 
stereospecificity because many 
reactions proceed through an 
oxocarbenium intermediate in 
which stereochemistry is lost. 
Deng et al. developed a versatile 
palladium-catalyzed reaction for 
the glycosylation of phenols that 
resembles palladium-catalyzed 
aryl carbon–oxygen cross-coupling 
reactions. Palladium 
oxidative addition to an easily 
prepared ortho-iodobiphenyl 
S-glycoside yields a complex that 
reacts with a wide range of phenolates 
through an SN2 mechanism 
to afford the glycosylated phenols 
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with inversion of stereochemistry. 
This reaction works well with, but 
is not limited to, 2-deoxy sugars 
and can be performed in one pot 
with other palladium-catalyzed 
cross-couplings to yield complex 
O-glycosides. —MAF 

Science, adk1111, this issue p. 928 


SULFUR CYCLE 
The story in sulfur 


The sulfur isotope composition 
of pyrite found in marine sedi- 
ments and sedimentary rocks is 
often used to try to reconstruct 
the coupled cycles of carbon, 
oxygen, and sulfur. However, the 
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resulting interpretations can be 
complicated by the competing 
effects of physical and biological 
processes. Halevy et al. show that 
inorganic reactions and transport 
in depositional environments, 
rather than microbial influence, 
lead to the wide range of sulfur 
isotopic values observed. The 
observed increase in sulfate–pyrite 
isotope fractionation over 
most of Earth's history primarily 
reflects the effects of increasing 
marine sulfate concentration, 
except over the past 550 million 
years, when supercontinent 
breakup and assembly and 
variations in sea level were more 
important. Bryant et al. present a 
microanalytical method applied 
to individual pyrite grains that 
deconvolves the multiple signals 
influencing sulfur isotopes. They 
were able to determine both the 
microbial isotopic effects and 
inorganic fractionation produced 
by depositional conditions. Their 
approach will help in reassessing 
conflicting interpretations of the 
sulfur isotopic record. —HJS 
Science, adg6103, adh1215, 
this issue p. 912, 946 


Antipsychotics and an 
appetite hormone 


Unwanted side effects such as 
weight gain and insulin resistance 
are well-known consequences of 
certain antipsychotic drugs. Zhao 
et al. show that two such drugs, 
olanzapine and risperidone, 
elicited a rise in leptin in mice that 
both preceded and contributed 
to treatment-induced obesity and 
insulin resistance. Addition of a 
leptin-neutralizing antibody to 
the treatment regimen prevented 
many of these metabolic effects 
and also attenuated drug-induced 
local and systemic inflammation. 
—CAC 


Sci. Transl. Med. (2023) 
10.1126/scitranslmed.ade8460 


Unleashing myeloid cells 
Myeloid cells are necessary for 
productive antitumor immunity 
but are also susceptible to immu- 
nosuppressive reprogramming 


in the tumor microenvironment. 
Using mouse models of pan- 
creatic ductal adenocarcinoma, 
Wattenberg et al. found that com- 
bined targeting of the myeloid 
cell-activating receptors CD40 
and dectin-1 could unleash potent 
antitumor immunity against 
established pancreatic tumors. 
This treatment required T cells 
but was not associated with the 
classical T cell cytotoxicity and 
immune checkpoint pathways. 
—CO 
Sci. Immunol. (2023) 
10.1126/sciimmunol.adj5097 


Reducing noise 
Charge current in solids is car- 
ried by discrete entities called 
quasiparticles that give rise to 
associated shot noise. For the 
peculiar phase called the strange 
metal, however, this quasiparticle 
scenario is expected to break 
down, leading to a reduction in 
shot noise. Chen et al. tested this 
prediction by measuring shot 
noise in nanowires made of the 
heavy fermion material YbRh₂Si₂ 
in the strange metal phase. In 
these samples, shot noise was 
indeed reduced compared with 
the values measured in a com- 
parable gold nanowire and with 
the theoretical expectations for a 
system of quasiparticles. —JS 
Science, abq6100, this issue p. 907 


Amide extension 
Traditionally, organic chemistry 
operates by assembling a carbon 
skeleton and then modifying 
its periphery. However, a flurry 
of recent research has instead 
focused on inserting or removing 
atoms into or out of preexisting 
carbon frameworks for more 
efficient pharmaceutical opti- 
mization. Zhang et al. report a 
reaction sequence for introduc- 
ing methylene chains of varying 
lengths between the carbonyl 
and α-carbon of amides. Initial 
α-alkylation and installation of a 
directing group set the stage for 
rhodium-catalyzed cleavage of 
the carbon–carbonyl bond and 
subsequent isomerization. —JSY 
Science, adk1001, this issue p. 951 
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Long-term use of hair 
relaxants, which can 
contain endocrine 
disrupters, is 
associated with 
uterine cancer. 


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Health costs of fashion 


Beauty standards promote straight hair as the ideal 
for women. Therefore, sales of chemical hair relaxers 
to straighten curly, thick hair have skyrocketed for 
decades, particularly among Black American women. 
These hair products can include carcinogens such as 
formaldehyde, as well as phthalates or parabens, which 
disrupt women’s endocrine systems and hormones. However, 
longitudinal research has rarely documented the health 
consequences of long-term exposure to chemical hair relaxers. 
Bertrand et al. examined data from the Black Women’s Health 
Study, which lasted from 1997 to 2019, and found an associa- 
tion between postmenopausal women who had previously 
used chemical hair relaxers regularly for more than 5 years 
and a 50% higher risk of uterine cancer. —EEU 


Environ. Res. (2023) 10.1016/j.envres.2023.117228 


Impacts of oversight and 
outrage 

High-profile killings by police 
can trigger calls for increased 
police oversight, but they also 
spark public outrage and change 
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public behavior around policing, 
which can confound attempts to 
estimate the impacts of police 
reforms. This feedback may 
explain why reforms could appear 
to overpolice the police, causing 
them to reduce effort and leading 
to increased crime. Rivera and 
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BIOCATALYSIS 
Natural advantages, 
engineered 


As nature's catalysts, enzymes 
are adaptable, specific, and 
interoperable. There have been 
important advances in the 
identification of transformations 
catalyzed by enzymes, and new 
methods have been developed 
for identifying, engineering, 
or creating de novo suitable 
enzymes for a wide range of 
applications. In a Review, Buller 
et al. collect this recent progress 
in biocatalysis research and 
look ahead toward increas- 
ing the use of computational 
tools, accelerating design test 
cycles, and expanding chemistry 
beyond what is familiar in nature. 
Engineered enzymes provide a 
green chemistry solution to the 
synthesis of complex organic 
molecules and can be used in 
the degradation of waste plastic 
and chemicals. -MAF 

Science, adh8615, this issue p. 899 


CRISPR 
Uncovering diverse 
CRISPR-Cas systems 


Microbial biochemical systems 
are incredibly diverse, and 
computational tools to analyze 
sequence data are essential in 
identifying new and valuable 
components for biotechnology 
development. Using an approach 
called deep terascale clustering, 
Altae-Tran et al. found more than 
200 new functional systems 
linked to CRISPR, a technology 
for editing DNA. Some of the 
discovered genes are linked to 
precise DNA-editing systems 
that may enable safer therapeu- 
tic genome editing. The authors 
also identified a CRISPR-Cas 
enzyme, Cas14, which cuts RNA 
precisely. These discoveries may 
help to further improve DNA- 
and RNA-editing technologies, 
with wide-ranging applications 
in medicine and biotechnology. 
—DJ 

Science, adi1910, this issue p. 900 
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EVOLUTIONARY BIOLOGY 
Many peaks, but easy 
to climb 


How many mutations does it 
take to move from one genetic 
fitness peak to another in a 
fitness landscape? Papkou et 
al. performed mutagenesis to 
survey the combinatorial geno- 
typic space of nine nucleotides 
encoding three successive 
amino acids in a protein targeted 
by antibiotics in Escherichia coli. 
The authors found that most 
genotypes had low fitness, but 
that traveling between high 
fitness peaks required surpris- 
ingly few mutations. This work 
represents an exhaustive exami- 
nation of more than 260,000 
genotypes, surveying a nearly 
complete network of mutational 
paths to answer a long-standing 
question. —CNS 

Science, adh3860, this issue p. 901 
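
The figure of "more than 260,000 genotypes" follows directly from the size of the sequence space: nine positions with four possible nucleotides each. The sketch below is only an illustration of that arithmetic and of what a single-mutation step between genotypes looks like; the function names and the example sequence are hypothetical and are not taken from the study.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
SITES = 9  # nine nucleotides encoding three successive amino acids


def count_genotypes(sites: int = SITES) -> int:
    """Number of distinct sequences of the given length over four bases."""
    return len(NUCLEOTIDES) ** sites


def hamming(a: str, b: str) -> int:
    """Number of point mutations separating two equal-length genotypes."""
    if len(a) != len(b):
        raise ValueError("genotypes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def single_mutation_neighbors(genotype: str):
    """Yield every genotype reachable by exactly one point mutation."""
    for i, base in enumerate(genotype):
        for alt in NUCLEOTIDES:
            if alt != base:
                yield genotype[:i] + alt + genotype[i + 1:]


total = count_genotypes()
# Brute-force enumeration as a cross-check of the closed-form count.
enumerated = sum(1 for _ in product(NUCLEOTIDES, repeat=SITES))
example = "ATGGCTAAA"  # arbitrary illustrative genotype
neighbors = list(single_mutation_neighbors(example))

print(total, enumerated, len(neighbors))
```

Each genotype has 27 single-mutation neighbors (nine sites, three alternative bases each), and no two genotypes are more than nine mutations apart, which bounds how long any direct path between peaks can be.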


TISSUE MORPHOGENESIS 
Supracellular 
properties emerge 


Morphogens are secreted 
molecules that have criti- 
cal roles in the generation of 
tissue morphology, but how 
individual cell effects “scale up” 
to create supracellular struc- 
tures composed of hundreds 
to thousands of cells remains 
elusive. Focusing on avian skin, 
Yang et al. discovered that two 
morphogens enable material 
and mechanical properties 
that manifest at the supracellular 
level (see the Perspective 
by Banavar and Nelson). 
Functional effects of morpho- 
gens in large tissues can thus be 
understood by consideration of 
emergent supracellular proper- 
ties. —SMH 

Science, adg5579, this issue p. 902; 

see also adl2004, p. 880 


THERMOELECTRICS 
Improving at the interface 


Thermoelectric modules convert 
waste heat into electricity, but 
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finding materials that go in 
between the thermoelectric 
material and the electrodes is 
challenging because inappropri- 
ate interface materials can drive 
failure of the thermoelectric 
module. Xie et al. developed a 
screening strategy for isolating 
more chemically complex inter- 
face candidate materials (see 
the Perspective by Xu and Tian). 
Using this strategy, the authors 
identified a magnesium–copper–antimony 
semimetal that is 
an excellent interface material 
for a specific type of high-perfor- 
mance thermoelectric module. 
This approach should apply to a 
wide range of material chemis- 
tries. —BG 

Science, adg8392, this issue p. 921; 

see also adl2157, p. 882 


POLLUTION 
No coal is a healthy 
option 
The success of measures to 
mitigate environmental dam- 
age can be hard to assess. The 
advent of new modeling tools 
brings us closer to estimates 
that are reproducible and do not 
need expensive and time-con- 
suming computing. Henneman 
et al. found that coal-burning 
power stations emit fine 
particulates (PM2.5) containing 
sulfur dioxide that are associated 
with higher mortality than 
other types of PM2.5 (see the 
Perspective by Mendelsohn and 
Min Kim). Using a reduced-form 
atmospheric model combined 
with historical Medicare data 
from the US, the authors identi- 
fied the coal-burning power 
plants associated with the 
greatest mortality and esti- 
mated the effect that closure 
or scrubber installation has 
had on reducing it. This type of 
approach can provide a rapid 
indication of the effectiveness 
of environmental protection 
measures to inform ongoing 
policy decisions. —CA 

Science, adf4915, this issue p. 941; 

see also adl2935, p. 878 
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MINERALOGY 
Removing disorder 
to grow 


Dolomite, a calcium magnesium 
carbonate, is one of the major 
minerals in carbonate rocks. 
However, growing the mineral 
under laboratory conditions has 
proven very difficult, resulting 
in the so-called “dolomite problem.” 
Kim et al. may have solved 
this problem by identifying 
the need to cycle the solution 
between undersaturated and 
supersaturated conditions (see 
the Perspective by Garcia- 
Ruiz). Cycling speeds up crystal 
growth 10 million times and 
may be imperative for making 
large amounts of dolomite. This 
observation is consistent with 
where we see dolomite forma- 
tion in nature: in coastal and 
evaporative environments. —BG 
Science, adi3690, this issue p. 915; 
see also adl1734, p. 883 


NEUROGENESIS 
Adapting pool of adult 
neural stem cells 


Different subtypes of new 
neurons are born in the adult 
brain ventricular-subventricular 
zone from spatially distinct pools 
of neural stem cells (NSCs). 
However, the physiological 
relevance of NSC diversity and 
specificity is unclear. Chaker 
et al. have revealed that during 
mouse pregnancy, multiple NSC 
pools are activated in mothers 
and generate specific olfactory 
bulb interneurons that func- 
tion around birth to modulate 
aspects of maternal care, 
including own-pup recognition, 
and then disappear as pups 
mature (see the Perspective 
by Kempermann). These 
results highlight how adult NSC 
heterogeneity might provide 
a substrate for adaptive brain 
plasticity in response to different 
physiological states. -MMa 
Science, abo5199, this issue p. 958; 
see also adl2399, p. 881 
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PAIN 
Gut glia turn painful 
with inflammation 


The pain associated with gut 
inflammation can be debilitating, 
chronic, and difficult to manage. 
Morales-Soto et al. found that 
local glial cells may be major 
contributors to this inflamma- 
tory pain. Inducing inflammation 
in the gut of mice provoked 
enteric glia to secrete prostaglandin 
E₂, which activated 
receptors on gut-innervating 
sensory neurons that sensitized 
the neurons to stimuli that were 
otherwise not noxious. Activating 
the glia in the absence of 
inflammation did not stimulate 
neurons or produce visceral 
pain. —LKF 

Sci. Signal. (2023) 10.1126/scisignal.adg1668 
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Megafloods can be anticipated better by considering 
the history of hydrologically similar catchments. 


Past as megaflood prologue 


Several recent high-profile flooding events have raised concerns about future damaging 
large-scale floods. Bertola et al. used river gauge observations recorded since 1810 to determine 
where megafloods have occurred previously and show that more recent megafloods could 
have been anticipated from these observations. By considering the continental scale instead 
of the more traditional local approach, the authors found that megafloods occur in places 
where there have been large floods before or in those that are hydrologically similar to where 
previous megafloods have occurred. This work should help to improve assessments of flood risk, 
especially in places susceptible to outlier events. —BG 


Nat. Geosci. (2023) 10.1038/s41561-023-01300-5 


Ba disentangled these effects 
by observing the impacts of 
oversight reforms resulting from 
two unexpected court rulings in 
Chicago in 2009 and 2014 that 
went largely unnoticed by the 
public. The reforms did not lead 
to less policing or higher crime, 
but did reduce public complaints 
about police misconduct. —BW 
Rev. Econ. Stat. (2023) 
10.1162/rest_a_01377 


Monolayer integrated 
nanophotonics 


Silicon remains the platform of 
choice for integrated photon- 
ics and optoelectronics, but 
further miniaturization will 
require new material platforms. 
Two-dimensional van der Waals 
materials such as the transition 
metal dichalcogenides offer tunable 
optoelectronic properties. 
Ling et al. and Vyshnevyy et al. 
demonstrate the confinement 
and waveguiding of light within 
thin layers of MoS₂ and WS₂, 
respectively. The ability to confine 
the light to just a fraction of the 
wavelength and guide it around 
the material is promising for 
developing the next platform of 
integrated nanophotonic circuits 
and devices. —ISO 
Optica (2023) 
10.1364/OPTICA.499059 
Nanoletters (2023) 
10.1021/acs.nanolett.3c02051 


Oocytes’ internal pantry 
One of the key functions of 
oocytes is to accumulate proteins 
and other materials necessary for 
the proper development of the 
zygote and, after successful fertil- 
ization, the embryo. Mammalian 
oocytes have long been known 
to contain structures called 
cytoplasmic lattices, which can be 
visualized by electron microscopy, 
but their function was not under- 
stood. Jentoft et al. show that 
PADI6 and SCMC, two proteins 
essential for lattice formation, 
are distributed throughout the 
lattice-filled cytoplasm and are 
physically integrated into the lat- 
tices. The two proteins are directly 
involved in the accumulation and 
storage of proteins in mammalian 
oocytes, including in humans, 
which explains why these struc- 
tures are essential for successful 
reproduction. —YN 
Cell (2023) 
10.1016/j.cell.2023.10.003 
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The pain of alcohol 
withdrawal 


Severe headaches are a conse- 
quence of alcohol withdrawal in 
addicted individuals, hampering 
the success of rehabilitation 
therapy. Unfortunately, there 
are no therapeutic options for 
these headaches. Son et al. 
used rodent models of alcohol 
withdrawal and showed that 
the mast cell-specific recep- 
tor MrgprB2 was responsible 
for the headache symptoms. 
Animals lacking this receptor 
failed to develop sensory neuron 
hypersensitivity and headache 
symptoms during withdrawal. 
The mechanism involved mast 
cell degranulation in the outer 
membranes of the brain. Drugs 
targeting MrgprB2 might repre- 
sent a therapeutic opportunity 
for preventing the painful head- 
aches associated with alcohol 
withdrawal. -MMa 
Neuron (2023) 
10.1016/j.neuron.2023.09.039 


Tolerating virulence 
Bat viruses, if they succeed in 
infecting humans, cause higher 
fatalities than most other zoono- 
ses. Remarkably, bats themselves 
appear to be little affected by 
the diverse viruses they host. It 
is hard to explain this effect by 
phylogenetic distance between 
candidate hosts, so Brook et 
al. applied a nested modeling 
framework to understand the 
reservoir host and virus traits 
that might predict virulence 
in alternative hosts. The main 
driver for the evolution of virus 
tolerance in bats appears to be 
flight. The physiological costs of 
flight have apparently selected 
for suppression of inflamma- 
tory responses, viral tolerance, 
and longevity. Viral tolerance 
does not mean any reduction in 
viral load, so high-growth-rate 
pathogens will emerge that have 
the greatest chances of trans- 
mission. In a new, immune naive 
host, this could mean exceptional 
virulence. —CA 
PLOS Biol. (2023) 
10.1371/journal.pbio.3002268 
From nature to industry: Harnessing enzymes for biocatalysis 


R. Buller, S. Lutz, R. J. Kazlauskas, R. Snajdrova, J. C. Moore, U. T. Bornscheuer* 


BACKGROUND: Biocatalysis is an approach to 
synthetic chemistry in which enzymes carry 
out chemical reactions. Historically, enzymes 
from natural sources have been used to break 
down oils and proteins in laundry detergents, 
produce semisynthetic antibiotics, and create 
simple chiral precursors for the pharmaceuti- 
cal industry. In the past 5 years, the number of 
available protein sequences has increased by a 
staggering 20-fold, accelerating the discovery 
of enzymes with useful activities and proper- 
ties. Directed evolution, the cornerstone of our 
ability to tailor enzymatic properties, is allowing 
researchers to tailor enzymes for the synthesis of 
complex molecules, the modification of biol- 
ogical therapeutics, and the breakdown of 
plastic waste. Machine learning-driven protein- 
structure prediction, coupled with advances in 
automation and high-throughput screening, is 
further advancing our ability to create enzymes 


with desired function. The ability to add non- 
biological catalytic elements to enzymes means 
that enzyme engineers no longer have to rely 
solely on natural catalytic machinery. Scientists 
have thus gained the capacity to redesign, 
reimagine, and repurpose enzymes. Illustrative 
applications include enzyme cascades to man- 
ufacture the antivirals islatravir and molnupir- 
avir, biocatalysts that can generate and control 
radicals, and enzymes that exploit photocatalysis 
to effect stereocontrolled C-C couplings, 
hydroaminations, or Diels-Alder reactions. 


ADVANCES: The past 5 years have witnessed a 
surge in the development of data-driven tools 
enabling the accelerated discovery, engineering, 
and deployment of enzymes for applications 
in chemistry, medicine, and food technology. 
By leveraging enzymes’ ability to control the
environment of a chemical reaction, scientists
have developed successful platforms for the
construction of complex small molecules from
suitable starting materials or even CO2. In addition
to their use in (chemo)-enzymatic cascade
reactions, precisely tailored biocatalysts
can manufacture new therapeutic modalities,
such as antisense oligonucleotide therapeutics
and bioconjugates. DNA synthesis, which still
relies on phosphoramidite chemistry, is being
reinvented with template-independent deoxynucleotidyl
transferases to make the process
faster and cheaper. Machine learning, already
dominating protein-structure prediction and
design, is finding applications in enzyme engineering,
including improvement of functions
such as enantioselectivity, activity, and stability.

[Figure panel labels: Wild-type enzyme panels; Directed evolution; Redesigning enzyme mechanism; Computational design; In silico and experimental tools; Applications; (Chemo)enzymatic cascades to value-added products; Biotherapeutics; Re- and upcycling]

The accelerating development of biocatalysis. In silico and experimental tools developed in the last
decade allow the fast creation of tailored enzymes. In addition to the concise synthesis of complex small
molecules in elegant (chemo)-enzymatic cascades, newly accessible applications include enzymatic
DNA synthesis, the generation of therapeutic oligonucleotides, and up- and recycling of plastic waste.


OUTLOOK: In the decade ahead, biocatalysis 
research and applications will continue to pro- 
fit from advances in data mining, machine 
learning, and DNA reading and writing. The 
combinatorial design of enzymes and the ease
with which new enzyme variants can be experimentally
generated lend themselves to training
data-intensive machine-learning algorithms.
Ideally, the sequence-function data of variants 
screened in a directed-evolution campaign could
be used to predict which variants to evaluate 
next. Machine learning works best with clean 
and reproducible data, mandating the standardization
and reporting of data and the further
development of experimental techniques,
including molecular biology methods, automa- 
tion, and high-throughput screening assays. 

The toolbox of drug discovery is expanding 
beyond traditional small molecules (molecular 
weights <500 g/mol) to include RNA therapeu- 
tics, protein degraders, cyclopeptides, antibody 
drug conjugates, and gene therapy. Consequent- 
ly, the synthetic complexity of lead molecules 
and clinical candidates is increasing with a 
growing percentage of chiral and beyond-rule-of-5
molecules. Enzymatic synthesis will play a
key role beyond its present impact in small- 
molecule drug discovery and development. 

The discovery of new enzyme families and the 
rational design of new enzyme functions will 
expand the toolbox of available biocatalysts. 
Development and deployment of retrosynthetic 
tools containing enzymatic reactions will be an 
important step toward democratizing biocatal- 
ysis and making it available to the nonexpert. 
Fueled by these innovations, nature’s catalysts 
will be profitably used to address current chal- 
lenges, including the fight against diseases, 
provision of affordable clean energy, and reduction
of industrial and consumer waste.


The list of author affiliations is available in the full article online. 
From nature to industry: Harnessing enzymes for biocatalysis

R. Buller, S. Lutz, R. J. Kazlauskas, R. Snajdrova, J. C. Moore, U. T. Bornscheuer*


Biocatalysis harnesses enzymes to make valuable products. This green technology is used in countless 
applications from bench scale to industrial production and allows practitioners to access complex 
organic molecules, often with fewer synthetic steps and reduced waste. The last decade has seen an 
explosion in the development of experimental and computational tools to tailor enzymatic properties, 
equipping enzyme engineers with the ability to create biocatalysts that perform reactions not present in 
nature. By using (chemo)-enzymatic synthesis routes or orchestrating intricate enzyme cascades, 
scientists can synthesize elaborate targets ranging from DNA and complex pharmaceuticals to starch 
made in vitro from CO2-derived methanol. In addition, new chemistries have emerged through the combination 
of biocatalysis with transition metal catalysis, photocatalysis, and electrocatalysis. This review highlights 
recent key developments, identifies current limitations, and provides a future prospect for this rapidly
developing technology.


To innovate in synthetic chemistry, academic
and industrial scientists increasingly
apply enzymes to make simple and
complex functionalized (bio)molecules
(1). Inspired by the precise control that
enzymes can exert over reaction outcomes, 
chemists are using biocatalytic transforma- 
tions that complement or even substitute for 
more traditional chemical routes (2). However, 
as synthetically relevant reactions rarely have 
a counterpart in nature, the challenge in bio- 
catalysis is the identification of a suitable en- 
zyme catalyst for the desired application. Early 
uses of biocatalysis relied on accessible wild- 
type enzymes used in food or other industries 
to produce laundry detergents, semisynthetic 
antibiotics, and simple chiral precursors for 
the pharmaceutical industry (3-5). Although 
such repurposing still occurs occasionally, most 
new applications require the discovery of en- 
zymes with distinct reactivity or the engineer- 
ing of existing enzymes to catalyze the desired 
reaction, accept the desired substrate, and be 
stable and active at the desired application 
conditions. The computational and experimen- 
tal advances of the past few years have sped 
and simplified the tailoring of biocatalysts to 
access a rapidly growing set of chemical trans- 
formations. Today, strategies for the development 
of performant biocatalysts include screening
natural diversity to discover a desired enzyme 
activity, engineering biocatalysts to alter the 
substrate range, redesigning mechanisms to 
create previously unknown reactivity, and 
computationally designing enzymes de novo 
(Fig. 1). These strategies now allow the field to 
be much bolder in choosing which reactions 
are attempted enzymatically. Biocatalysts are 
being redesigned, reimagined, and repurposed 
to grant access to the desired targets with 
great efficiency. 

Here, we evaluate the recent experimental 
and computational developments that led to 
this boost in biocatalysis and its applications 
(Fig. 2). Highlighted experimental innovations 
include optimized strategies to read and write 
DNA, advanced tools for automation and high- 
throughput screening, and the ability to incor- 
porate novel catalytic elements into enzymes, 
including noncanonical amino acids. Compu- 
tational progress is reflected in improved ac- 
cess to suitable enzyme-encoding sequences, 
machine learning-based methods to accurately 
predict protein structures, and data-driven tools 
to guide enzyme engineering in the creation 
of information-enriched, functional variant 
libraries. These achievements have translated 
into a variety of applications, including (chemo-) 
enzymatic cascade reactions, new-to-nature 
chemistries, enzymatic plastic degradation, 
and the synthesis of biologics and therapeutics 
(Fig. 2). 


Developing biocatalysts 


There are several approaches that can be used 
to find or create an enzyme with the desired 
catalytic activity (Fig. 1) (6). First, the target 
activity could exist in nature but needs to be 
discovered. High-quality, low-cost sequencing 
of DNA has now revealed complete genomes
of species from all over the globe (7, 8), as well
as DNA fragments from metagenome samples.
The number of available protein sequences
increased more than 20-fold in the last 5 years
[2023, >2.4 billion (9, 10); 2018, ~123 million
(11)]. Searching these sequences reveals many
putative enzymes, some with predictable ac- 
tivities and many that are of unknown func- 
tion. Combining this sequence data with 
computational tools such as sequence similarity
networks (12), phylogenetic analyses (13),
and protein-structure prediction improves the 
precision of the search. 

In a representative example, polyethylene 
terephthalate (PET) hydrolases (PETases) to 
degrade PET plastics were found through classical
screening (14), bioinformatics searches
(15), and sequencing of environmental samples
(16). Alternatively, legacy collections
of enzymes that proved useful in the
past are an invaluable resource for identifying
suitable biocatalysts or advanced starting points 
for engineering projects. Commercially avail- 
able enzyme screening kits and private libraries 
of enzyme variants for frequently used enzymes 
such as lipases, esterases, ketoreductases, and 
transaminases can considerably shorten time- 
lines toward yielding biocatalysts with the de- 
sired target activity. More recently, advances 
in protein structure prediction (see “Compu- 
tational tools” section below) have further im- 
proved the search for suitable enzymes in 
sequence databases. For example, in a search 
for dehalogenases, a sequence similarity search 
identified 2905 putative target enzymes (17). 
Subsequent analysis of homology models— 
which considered the size of the active site, 
the presence of catalytic residues, and tunnels 
to access the active site—narrowed the search 
to 45 genes, 40 of which yielded catalytically 
active dehalogenases. 
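
The narrowing funnel in the dehalogenase example can be tallied in a few lines. This is a minimal sketch that uses only the three counts quoted above; the function name `funnel_stats` and the dictionary keys are illustrative assumptions and do not come from the cited study.

```python
def funnel_stats(sequence_hits: int, after_structure_filter: int, active: int) -> dict:
    """Summarize an enzyme-discovery funnel from raw counts (names are illustrative)."""
    return {
        # Share of sequence-similarity hits that survive the structure-based filter
        "retained_fraction": after_structure_filter / sequence_hits,
        # Share of the filtered genes that yielded catalytically active enzymes
        "success_rate": active / after_structure_filter,
    }


# Counts taken from the text: 2905 putative enzymes, 45 selected genes, 40 active
stats = funnel_stats(2905, 45, 40)
print(f"retained after modeling: {stats['retained_fraction']:.1%}")
print(f"active among tested:     {stats['success_rate']:.1%}")
```

On these numbers, structure-based filtering keeps under 2% of the sequence hits, yet close to nine in ten of the genes that were carried forward proved active.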

The most common strategy to access a de- 
sired enzymatic function is repurposing of an 
existing enzyme that fits many, but not all, 
requirements for the new application. Often 
an existing enzyme catalyzes the desired re- 
action but does not accept the desired sub- 
strate or does not produce the desired product 
isomer. In this situation, protein engineering is 
required to shift the substrate or product pref- 
erence of the enzyme. For example, the redesign 
of an aspartase enabled the hydroamination
of α,β-unsaturated carboxylic acids to make
β-amino acids (18). The target substrates contained
a hydrophobic substituent at the position
corresponding to the α-carboxylate of aspartic
acid. The design replaced four mostly polar 
amino acids in the binding site with hydro- 
phobic amino acids. In another case, triggered 
by the need for selective halogenation reactions,
researchers expanded the substrate
range of Fe(II)/α-ketoglutarate-dependent halogenases
to include nonnatural substrates

Fig. 1. Strategies for biocatalyst development. (Top left) Screening natural diversity from environmental samples or from enzyme libraries can aid in discovering the desired enzyme activity. (Top right) Directed evolution starts with an enzyme that has a small amount of performance on a desired reaction and tunes the enzyme to work well under desired reaction conditions. Substrate walking by means of directed evolution expands that capability by starting with an enzyme that is mechanistically competent to catalyze a desired reaction (but doesn't actually catalyze it) and engineers the enzyme to execute the reaction on the target substrate to make a desired product (i.e., alters its substrate range). (Bottom left) Redesigning the enzyme mechanism exploits chemical intuition to create new chemical functionality, for example, through cofactor repurposing and the addition of new prosthetic groups. (Bottom right) Computational design equips inert protein structures with enzymatic function. All images from stock.adobe.com, except where noted. Circled images, clockwise from bottom left: images 1 and 2, ylivdesign; image 3, Lifeking; image 4, ylivdesign. Images in boxes, clockwise from bottom left: box 1, by author; box 2, images 1 to 3 (left to right) by davooda, image 4 by SkyLine; box 3, Lifeking; box 4, by author.


(19-21), such as the macrolide soraphen A, a 
potent fungicide. By using algorithm-aided 
engineering, the apparent turnover number
(kcat) for soraphen was improved >100-fold,
turning the halogenase into a suitable cat- 
alyst for aiding in structure-activity relationship 
studies in medicinal chemistry. The halogenase 
also accepts other anions, including azide, ni- 
trate, and nitrite, enabling the creation of a 
wider range of products (22). In another example,
the selectivity of a protease was altered
to degrade gluten peptides as a potential sup- 
plement for patients with celiac disease (23). 
The starting protease favored hydrolysis af- 
ter a Pro-Arg/Lys sequence, whereas gluten 
contains many Pro-Gln-Gln/Leu sequences. 
Hydrolysis was desired between Gln and 
Gln/Leu residues. The researchers introduced 
eight substitutions, two in the new Gln site, 
one in the Gln/Leu site, and five more distant 
substitutions to stabilize the protein. The mod- 
ified protease is currently undergoing clin- 
ical trials. 

The concept of “substrate walking” (evolving
an enzyme from transforming its natural substrate
to accepting an industrially relevant substrate)
was rarely applied (24, 25) a little over a
decade ago. The evolution of a transaminase to
manufacture sitagliptin (26) is an early exam- 
ple. Today, Pictet-Spenglerases that use benz- 
aldehyde (27), flavin-dependent halogenases 
active on high-molecular weight indoles and 
carbazoles (28), opine dehydrogenases (reductive
aminases) that use amines and ketones as
opposed to amino- and ketoacids (29), and 
transketolases that use nonpolar aromatic 
substrates (30) have all been created through 
substrate walking. 

In more difficult cases, no enzyme with the 
desired reactivity exists, but an enzyme with a 
mechanistically similar catalytic activity is 
known. Combining chemical reasoning with 
protein engineering can extend the natural 
catalytic activity to the desired activity (6, 31).
One example is the expansion of the catalytic 
repertoire of cytochrome P450 monooxygenases 
to catalyze carbene and nitrene reactions. Their 
oxidase mechanism involves an iron porphyrin 
oxo (Fe=O) intermediate, and it was reasoned 
that iron carbene [Fe=C(R1)(R2)] and iron ni- 
trene (Fe=NR) species might form similarly 
given the appropriate reactive substrate (32, 33). 
Initial experiments revealed inefficient reac- 
tions, but optimization by directed evolution 
and replacement of the proximal thiolate heme 
ligand with the more weakly donating serine— 
yielding the so-called P411 scaffolds—led to 
catalysts capable of new-to-nature chemistry, 
including cyclopropanation, cyclopropena- 
tion, Si-C bond formation, B-C bond forma- 
tion, C-H insertion, and alkyl transfer (Fig. 3) 
(34), as well as aziridination, sulfide imida- 
tion, C-H amidation, and C-H amination (Fig. 
3) (35). These examples are enzyme-catalyzed 
reactions in vitro, but this new-to-nature carbene- 
transfer chemistry was also used to extend 
biosynthesis in vivo. Engineered Streptomyces
strains biosynthesized a carbene-transfer re- 
agent, azaserine, as well as the acceptor styrene, 
to yield unnatural cyclopropanes (36). P450 
monooxygenases can also be repurposed by 
following other strategies. For example, the 
P450 monooxygenase from Labrenzia aggregata
was fashioned into a ketone synthase capable 
of the direct oxidation of internal arylalkenes 
to ketones by harnessing highly reactive car- 
bocation species as key intermediates (37). To 
cite another example, researchers found that 
P450-based radical cyclases could be developed 
through the exploitation of a metalloredox 
strategy to catalyze stereoselective atom-transfer 
radical reactions yielding substituted γ-lactams
(38) or arenes (39). 

Changes in catalytic activity can also be more 
drastic; for example, a reductase was extended 
to a cyclase that forms C-C bonds through 
radical intermediates. Flavin-dependent “ene” 
reductases normally catalyze the reduction of
electronically activated alkenes through the
stepwise addition of H2: the first step occurs
through a hydride transfer (a two-electron
transfer) from the flavin hydroquinone, followed
by the second step, a proton transfer from a
conserved tyrosine residue (Fig. 3). In other
enzymes, flavins reduce substrates through 
two single-electron transfers, which creates 
radical intermediates. By replacing the ac- 
tivated alkene substrate of an ene reductase 
with an α-bromo ketone, which is a radical

Fig. 2. Advances in experimental and computational tools have broadened the range of applications in biocatalysis. All images from stock.adobe.com, except where noted. Clockwise from bottom left: image 1 (Biologics and therapeutics), Sir.Vector; image 2, by author; images 3 and 4, ylivdesign; images 5 and 6, Artco; image 7, ylivdesign; image 8, muhamad; image 9, ylivdesign; image 10, davooda; image 11, Artco; image 12, RaulAlmu; image 13, ylivdesign; image 14, Skyline.


precursor, the ene reductase changes its mech- 
anism and transfers a single electron. Rapid loss 
of bromide yields the corresponding α-ketonyl
radical. In an appropriately selected substrate, 
this radical may cyclize, forming a C-C bond and 
abstracting a hydrogen atom from the flavin 
semiquinone (40). The surrounding active site 
guides the formation of new stereocenters in 
the product. Thus, the resulting reaction is an 
enantioselective reductive cyclization. 

In other cases, radical precursors are less 
reactive and require light irradiation to initi- 
ate the single-electron transfer. In photobio- 
catalysis, a cofactor or amino acid within the 
protein active site is photoexcited to promote 
the electron or energy transfer required to convert
starting materials to desired products.
Only three natural enzymes follow this syn- 
thetic logic: fatty acid photodecarboxylase (41), 
DNA photolyase (42), and protochlorophyllide 
reductase (43). Whereas the latter two en- 
zymes have less synthetic relevance, fatty-acid 
decarboxylases have been studied to catalyze 
the hydrodecarboxylation of fatty acids, a redox- 
neutral reaction that leads to alkanes, ren- 
dering them promising catalysts for biofuel 
production (44) and chemical building-block 
synthesis (45). 

Apart from radical cyclizations of well-chosen 
organohalides (46), the photobiocatalysis approach
similarly allows the alkylation of arenes
(47), the asymmetric cross-electrophile coupling 
of alkyl halides and nitroalkanes (48) (Fig. 3), 
and hydroaminations through the generation 
of amidyl radicals (49). Nonenzymatic photo- 
redox catalysts can also generate substrates 
for enzymes to create new reactions. Adding 
xanthene-based photocatalysts enabled an 
ene reductase to catalyze an enantioselective 
deacetoxylation (50). In another example, non- 
canonical amino acids were synthesized by 
using separate photocatalysis and enzymatic 
steps. The photocatalysis generates an alkyl 
radical from an alkyl trifluoroborate precursor. 
In the same solution, a modified tryptophan 

Fig. 3. Examples of redesigning the enzyme mechanism. (A) (Top) Oxygenation catalyzed by wild-type cytochrome P450 monooxygenase. The structural similarity of Fe oxenes and Fe nitrenes inspired the use of synthetic reagents to carry out new chemistries with engineered P450 enzymes. (Bottom left) These so-called P411 enzymes, which contain a serine residue as proximal ligand (denoted as X), were evolved to assemble C-C bonds through sp3 C-H functionalization (34) and to aminate at benzylic and allylic (not shown) positions to yield (bottom right) enantioenriched, unprotected primary amines (35). Piv, pivaloyl; Tf, trifluoromethanesulfonyl. (B) (Top left) Asymmetric double-bond reduction catalyzed by an ene reductase through a hydride transfer (two-electron transfer) from flavin mononucleotide (FMN) followed by a proton transfer from a conserved tyrosine residue (not shown). By applying photocatalysis, the catalytic machinery of the ene reductase can be repurposed when radical precursors are used as substrates (as shown here, an α-chloro ketone). EWG, electron withdrawing group. (Top right) Thermally allowed [4+2] cycloaddition between 4-carboxybenzyl-trans-1,3-butadiene-1-carbamate and N,N-dimethylacrylamide catalyzed by a designed Diels-Alderase (155). (Bottom left) Irradiation initiates a single-electron transfer from the flavin hydroquinone cofactor and facilitates the formation of an α-ketonyl radical, which can engage in stereoselective sp3-sp3 cross electrophile couplings in the confines of the engineered enzyme scaffold (48). (Bottom right) Installation of the noncanonical amino acid 4-benzoylphenylalanine (green) by genetic-code expansion expands the reaction scope of the Diels-Alderase. Upon irradiation of the designer enzyme, the ncAA allows for triplet energy transfer to appropriately selected substrates, giving access to thermally forbidden reactions, such as intra- and bimolecular (not shown) [2+2] cycloadditions with high stereoselectivities (156). Structural illustrations are adapted from the PDB (PDB IDs: 2IJ2, cytochrome P450BM3;


synthase catalyzes the dehydration of serine to 
form an enzyme-bound pyridoxyl-5’-phosphate 
aminoacrylate intermediate. Quenching of the 
radical and release of the intermediate yields a 
noncanonical amino acid. Adjusting the shape 
of the active site fine-tuned the enantioselectivity
of all described reactions for up to three
stereocenters (51).

Instead of modifying or extending exist- 
ing enzymatic activities, researchers have also 
computationally designed new catalytic activi- 
ties into a protein scaffold. This de novo design 
approach identifies the transition state of the 
desired transformation and then builds a bind- 
ing site to stabilize it, reducing the problem of 
biocatalysis to one of molecular recognition. 
De novo enzyme design has created
protein catalysts for model transformations, 
including proton-transfer, bimolecular-aldol, 
and Diels-Alder reactions (52). The initial ac- 
tivities of the designer catalysts were low, 
but directed evolution increased their catalytic
activities. For example, directed evolution
of a designed retroaldolase increased
its catalytic activity from barely detectable 
to that of natural enzymes. The directed evo- 
lution introduced 31 substitutions, leading to 
a million-fold increase in activity (53). More 
recently, a deep learning-based approach 
generated large numbers of idealized protein 
structures and the sequences that encode them.
The diverse scaffolds were employed to de- 
sign an artificial luciferase. Three substitu- 
tions introduced by site-saturation mutagenesis 
yielded a 100-fold-higher photon flux than 
did the parent design (kcat/KM = 10⁶ M⁻¹ s⁻¹)
(54) (where kcat/KM is the catalytic efficiency
and KM is the Michaelis constant). Although
de novo design of enzymes with activities that 
rival their natural counterparts remains a major 
challenge at present, these studies have im- 
proved our understanding of how sequences 
fold into proteins and of how basic enzyme 
activity is created. De novo enzyme design will 
continue to grow as our understanding of the
sequence-function relationship in enzymes im- 
proves, protein design methods mature, and 
computational power increases. 


Experimental tools 


Regardless of the strategy chosen to identify a 
suitable biocatalyst, laboratory work is always 
required to produce and, if needed, to tailor 
the target enzyme. The range of experimental 
tools available to protein engineers is con- 
sistently expanding, whether by lowering the 
cost of synthetic genes, speeding up individual 
directed-evolution cycles, or allowing new
catalytic elements to be added
to proteins.

Biocatalyst development starts with a DNA 
construct encoding the enzyme of interest, 
which requires artificial DNA synthesis. Present 
DNA synthesis uses phosphoramidite chemistry 
developed 40 years ago (55); however, DNA 
synthesis performed with enzymes (56-58) 

[Fig. 4, caption fragment] of point mutations even more quickly (periwinkle curve). Microfluidics-based ultrahigh throughput screening has yielded 100-fold improvements per round (76).
N indicates the number of evolution cycles. In this overview, it is assumed that one evolution cycle can be carried out per month.


yields higher-quality DNA and may be faster 
and cheaper, thus representing a biocatalytic 
solution in and of itself. Terminal deoxynu- 
cleotidyl transferases (TdTs) (Fig. 4) polymer- 
ize deoxynucleoside triphosphates (dNTPs) to 
the 3'-end of a DNA sequence in a template- 
independent fashion. Using dNTPs with a 
blocked 3’-hydroxyl stops TdT action after a 
single nucleotide incorporation, hence con- 
trolling an otherwise runaway polymerization. 
Native TdTs react slowly with blocked dNTPs, 
so improved TdTs were engineered for this 
application (59, 60). TdT-based approaches 
have driven DNA oligonucleotide synthesis to 
(literally) new lengths of up to 1000 nucleo- 
tides, enabled by incorporation efficiencies of 
>99.6%, reviving the promise for rapid, single- 
run whole-gene synthesis. Milder aqueous re- 
action conditions also benefit DNA quality and 
greatly improve the overall process sustainabil- 
ity, opening up opportunities for gene editing 
and diagnostic applications. TdTs can also as- 
semble short synthetic RNA fragments. Such 
sequences, containing modified nucleotides in 
addition to the standard RNA building blocks, 
hold substantial therapeutic potential as anti- 
sense oligonucleotides and small interfering 
RNAs (61-63).
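
To see why per-step incorporation efficiency matters at these lengths, the following minimal sketch compounds a fixed efficiency over every nucleotide addition. It assumes one independent coupling per nucleotide and no recovery of failed strands; the function name `full_length_fraction` is illustrative, and 0.996 is used as the lower bound of the ">99.6%" figure quoted above.

```python
def full_length_fraction(step_efficiency: float, n_couplings: int) -> float:
    """Fraction of strands that received every nucleotide.

    Assumes each coupling succeeds independently with the same probability
    (a simplification made for this sketch).
    """
    return step_efficiency ** n_couplings


# Oligonucleotide lengths chosen for illustration only
for length in (100, 300, 1000):
    frac = full_length_fraction(0.996, length)
    print(f"{length:>5} nt: {frac:.1%} full-length")
```

Under these assumptions, only about 2% of strands would be full length at 1000 nucleotides with exactly 99.6% efficiency per step, which is why efficiencies above that threshold are a precondition for single-run gene synthesis.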

Despite the advances in enzymatic DNA syn- 
thesis, most laboratories still rely on traditional 
molecular biology strategies to create large, 
randomized libraries through methods such as 
error-prone polymerase chain reaction (PCR) 
or the construction of a few specific variants 
through site-directed mutagenesis by using a 
chemically synthesized gene as template. Build- 
ing libraries of predefined enzyme variants 
remains expensive and challenging with these
traditional approaches. “Oligo-pools,” which
consist of a few hundred up to 1000 distinct
polynucleotides with a length of about 300 base
pairs, are a less-expensive alternative solution
(0.00001 to 0.001 USD per nucleotide, depend- 
ing on length, scale, platform, or vendor) (64). 
Despite disadvantages such as truncated DNA 
molecules and high error rates (65), the oligo- 
pool option may be more cost-effective than 
the degenerate or reduced codon-coverage 
primers typically used for library construc- 
tion and may allow more flexible library de- 
sign (66). 
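
The quoted price range translates into a rough cost per pool as follows. This is a back-of-the-envelope sketch: the pool size (1000 oligos) and length (300 nucleotides each) are taken from the upper end of the description above, the per-nucleotide prices are the quoted bounds, and the function name `pool_cost_usd` is an assumption made for illustration.

```python
def pool_cost_usd(n_oligos: int, length_nt: int, usd_per_nt: float) -> float:
    """Total synthesis cost of an oligo pool at a flat per-nucleotide price."""
    return n_oligos * length_nt * usd_per_nt


# Lower and upper bounds of the quoted price range (USD per nucleotide)
low = pool_cost_usd(1000, 300, 0.00001)
high = pool_cost_usd(1000, 300, 0.001)
print(f"estimated pool cost: {low:.2f} to {high:.2f} USD")
```

The estimate ignores the truncated molecules and error rates mentioned above, which reduce the share of usable sequences in a pool.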

The cornerstone of protein engineering 
(1, 3, 67, 68), the Nobel prize-winning strategy 
of directed evolution, relies heavily on such 
randomized gene libraries. When directed evo- 
lution appeared three decades ago, error-prone 
PCR was the driver of sequence variability, resulting
in round-over-round accumulation of
single mutations and an approximately twofold
improvement of enzyme performance per
round of evolution (69) (Fig. 4, light blue curve).
The contemporaneous addition of recombination
techniques raised the rate of improvement
to between 2.75- and 4-fold per evolution cycle
(Fig. 4, dashed curves) by incorporating multiple 
mutations per round, reflecting the state of the 
art over the past two decades. Computational 
and molecular biology efforts have improved 
mutation prediction and accelerated enzyme- 
variant creation so that the screening burden 
can be reduced, but not to the point that screen- 
ing cycles or improvements per cycle have been 
greatly altered from these historical norms (70). 
Additionally, protein engineers continue to increase
the complexity of attempted evolution
campaigns. Together, these developments point 
to an increasing bottleneck in biocatalysis: 
higher demand for protein-engineering re- 
sources and a lack of improvement in the way 
evolution experiments are performed. Tech- 
nical innovations are on the horizon, however. 
Cell-free protein-expression technologies pro- 
vide all the necessary components to transcribe 
and translate many DNA sequences into func- 
tional proteins, avoiding the time-consuming 
steps of cell transformation, growth, and induction
(71). Using this technology to produce
enzyme libraries for testing in evolution exper- 
iments has the potential to accelerate the time to 
complete a round of evolution by ~1.6-fold (Fig. 
4, green curve). Turning to the biological world, 
continuous-evolution strategies tie the success of 
a desired enzyme activity to the growth rate of a 
producing microbe and simultaneously provide 
the microbe with a mechanism of introducing 
mutation into the DNA coding for the desired 
protein only (72, 73). Evolution of the desired 
protein then occurs in continuous culture, which 
circumvents the same laboratory manipulations 
required for cell-free protein expression and in 
addition avoids the man-made construction of 
new variant libraries, leading to an eightfold 
increase in the rate of evolution performance 
compared with error-prone PCR [and a five- 
fold improvement compared with current state 
of the art (Fig. 4, periwinkle curve)]. For ex- 
ample, mRNA display libraries (74) have gen- 
erated new proteins, new types of catalytic 
activity, and binding proteins with very high 
affinity for their targets. Similarly, microfluidic 
technologies have demonstrated the ability to 
screen large library sizes, which positively changes
the nature of directed-evolution experiments 
(Fig. 4, black curve) (75). Encapsulating cells into 
picoliter-sized droplets, along with a lysis reagent 
and an analytical readout, has allowed scientists 
to assay thousands of samples per second, which 
is on par with the rates achieved by fluorescence- 
activated cell sorting (FACS) techniques. Be- 
cause millions of samples can be analyzed, 
libraries with multiple mutations per sequence 
can be screened deeply enough to uncover rare 
events, reaching 100-fold improvements per 
evolution round or greater (76). Although the lack of
commercialized hardware, relatively high implementation
costs, and technical complexity have
so far limited the use of such tools to expert
practitioners, these strategies show promise 
for managing the longer evolution timelines 
required for increasingly complex targets. 
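
Because improvements compound from round to round, the per-round fold gains quoted above translate into very different campaign lengths. The sketch below counts the rounds needed to reach a target improvement; the million-fold target and the function name `rounds_to_target` are illustrative assumptions, and the model treats the gain per round as constant, which real campaigns rarely are.

```python
def rounds_to_target(fold_per_round: float, target_fold: float) -> int:
    """Count evolution rounds until the compounded improvement reaches target_fold."""
    if fold_per_round <= 1:
        raise ValueError("fold_per_round must exceed 1")
    rounds, total = 0, 1.0
    while total < target_fold:
        total *= fold_per_round  # each round multiplies the cumulative gain
        rounds += 1
    return rounds


# Per-round fold improvements quoted in the text
scenarios = {
    "error-prone PCR (2-fold)": 2.0,
    "recombination, lower bound (2.75-fold)": 2.75,
    "recombination, upper bound (4-fold)": 4.0,
    "microfluidic screening (100-fold)": 100.0,
}
for label, fold in scenarios.items():
    print(f"{label}: {rounds_to_target(fold, 1e6)} rounds")
```

The sketch covers only the gain per round. Cell-free expression and continuous evolution shorten the time each round takes rather than raising the gain, so they would enter as a separate time-per-round factor that is not modeled here.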

Other emerging tools useful in
protein engineering include cryo-electron microscopy
(cryo-EM), which has been profitably
employed to guide improvement of a
nitrilase (77), and microcrystal electron diffrac- 
tion (MicroED), which supported the mecha- 
nistic investigation of carbene transfer in a 
designed protein (78). Both methods enable 
gathering of mechanistic insights and struc- 
tural data complementary to x-ray crystallog- 
raphy, which nevertheless remains an important 
tool for generating high-resolution structures, 
especially of smaller enzymes. 

Complementing the biocatalysts obtained 
by enzyme-engineering campaigns, some ap- 
plications require researchers to equip pro- 
teins with new-to-nature functionalities. This 
is achieved through the incorporation of non- 
canonical amino acids (ncAAs) by using either 
amber stop-codon suppression or promiscuous 
aminoacyl tRNA synthetases (aaRSs), which 
expands the repertoire of available catalytic 
elements beyond those encoded in the 20 pro- 
teinogenic amino acids (79). ncAAs can be 
harnessed to tune enzyme properties, reveal 
mechanisms in complex catalytic machineries, 
and create enzymes with functions not found 
in nature. 

An illustrative example of tuning enzyme 
properties by incorporating ncAAs is the 
introduction of Nδ-methylhistidine (NMH) as the 
proximal ligand in the heme protein ascorbate 
peroxidase, which led to substantial increases 
in turnover number without compromising 
catalytic efficiency (80). Similarly, introduc- 
ing NMH modulates the reactivity of com- 
pound II in cytochrome c peroxidase (81, 82) 
and enhances the promiscuous peroxidase 
(83) and cyclopropanation (84, 85) activities of 
myoglobin. Most recently, this noncanonical 
nucleophile was introduced into a designer 
enzyme created for the Morita-Baylis-Hillman 
reaction. During evolution, NMH substantially 
altered the evolutionary trajectory, yielding a 
variant an order of magnitude more active than 
previously engineered enzymes that 
did not include the new catalytic entity (86). 

Buller et al., Science 382, eadh8615 (2023)    24 November 2023 

The installation of tyrosine analogs is a val- 
uable tool for mechanistic investigations of en- 
zymes involving radical intermediates, in which 
such residues may play a catalytic role. Incor- 
porating halogenated tyrosine analogs at key 
positions in the active site of Escherichia coli 
ribonucleotide reductase (87) or verruculogen 
synthase (88), an Fe(II)/α-ketoglutarate-dependent 
dioxygenase, enabled the use of analytical tools 
such as electron magnetic resonance, giving 
key insights into the reaction mechanism. 

Installation of a photosensitizer through the 
use of amber stop-codon suppression created an 
enzyme that catalyzes a photoinduced [2+2]- 
cycloaddition. Upon irradiation of the unnatural 
amino acid 4-benzoylphenylalanine—which was 
installed in the active site of a de novo designed 
Diels-Alderase—triplet energy transfer initiated 
thermally forbidden [2+2]-cycloadditions. Di- 
rected evolution to fine-tune the enzyme yielded 
a catalyst with high enantioselectivity for 
both intra- and intermolecular reactions (89) 
(Fig. 3). 


Computational tools 


Machine learning-based structure-prediction 
tools promise to limit the excessive screening 
or selection necessary to analyze the huge lib- 
raries typically generated during directed evolu- 
tion. The advanced statistics system AlphaFold2, 
developed by DeepMind, predicts protein struc- 
tures from amino acid sequence with much 
greater accuracy than did previous methods. 
AlphaFold2 uses deep neural networks to pre- 
dict inter-residue distances from amino acid 
sequences (90). Its predictions rely on the 
structural data in the Protein Data Bank (PDB) 
(www.wwpdb.org), which contains the three- 
dimensional structures of 200,000 proteins and 
nucleic acids. The neural networks assume that 
substructures that appear frequently in the 
database are more stable than absent or rarely 
occurring substructures. At present, a database 
of predicted structures (https://alphafold.com/) 
contains over 200 million entries and is contin- 
uously growing. Similarly useful alternatives are 
RoseTTAFold (91) and ESMFold (92), the latter 
created by Meta Platforms. 

Structure prediction has the potential to re- 
veal the arrangement of active-site residues 
of the target enzyme, guiding enzyme optimi- 
zation (93). However, enzymatic function also 
requires substrate binding and product release 
and often relies on cofactors or metal ions. 
Tools such as AlphaFill (94) and DiffDock 
(95) project missing organic molecules and 
metal ions into the protein pockets, providing 
a model that may more accurately reflect the 
desired chemistry. To capture motion, Alpha- 
Fold2 can approximate conformational hetero- 
geneity by, for example, reducing the depth of 
the multiple sequence alignments that serve as 
input for the algorithm (96), as demonstrated 
when researchers used AlphaFold2 to create 
models of alternate conformations of a trypto- 
phan synthase (97). 

The deep neural networks trained to predict 
native protein structures from their amino 
acid sequences can be inverted to design new 
proteins (98-101). The first step is to predict 
the structure of a random amino acid sequence, 
which yields parts of a structure with confident 
predictions and other parts with uncertain 
predictions. Repeated modification of the amino 
acid sequence of the uncertain regions even- 
tually leads to a sequence whose entire struc- 
ture is predicted confidently. Testing these 
predictions often yields stable proteins with 
structures close to those predicted. These neu- 
ral network approaches to protein design are 
faster than physics-based methods such as 
Rosetta Match (102), Rosetta Design (103), and 
Rosetta Ligand (104) because they do not at- 
tempt to optimize interactions such as side- 
chain packing. However, the neural network 
approaches lack the physical transparency of 
methods such as Rosetta. Adding diffusion 
models trained on proteins [similar to Stable 
Diffusion, used to generate images (105), or 
ChatGPT, used to generate text] to protein de- 
sign expands the number and variety of de- 
signs generated, thus increasing the chances 
of success (106). Furthermore, language models 
can generate functional protein sequences 
across diverse families (107). 
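
The iterative design loop described above (start from a random sequence, find the regions predicted with low confidence, modify them, and repeat) can be sketched in a few lines. This is a toy illustration only: `toy_confidence` is a hypothetical stand-in that scores a residue as confident if it belongs to an arbitrarily chosen set, whereas a real workflow would call a structure-prediction network at that point and would usually accept or reject changes against a loss function. The function names, the threshold, and the seed are all assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
FAVORED = set("AELK")  # arbitrary toy choice, not a real design rule


def toy_confidence(seq):
    """Hypothetical stand-in for a predictor's per-residue confidence (0.0 or 1.0)."""
    return [1.0 if aa in FAVORED else 0.0 for aa in seq]


def hallucinate(length, threshold=0.9, max_steps=10_000, seed=0):
    """Mutate low-confidence positions until every residue is predicted confidently.

    Returns the final sequence and the number of mutation steps that were used.
    """
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    for step in range(max_steps):
        confidence = toy_confidence(seq)
        uncertain = [i for i, c in enumerate(confidence) if c < threshold]
        if not uncertain:
            return "".join(seq), step
        # Modify one residue within the uncertain region and try again.
        seq[rng.choice(uncertain)] = rng.choice(AMINO_ACIDS)
    return "".join(seq), max_steps


if __name__ == "__main__":
    designed, steps = hallucinate(30)
    print(designed, steps)
```

Only positions flagged as uncertain are changed, so residues that are already predicted confidently are left alone, which mirrors the repeated modification of uncertain regions described in the text.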

Although the design of protein structures is 
improving, design of proteins with catalytic 
function remains challenging. Design typically 
yields inefficient enzymes, which then require 
extensive optimization by directed evolution 
(see “Experimental tools” section above). This 
might be because efficient catalysis requires 
more precise positioning of catalytic groups 
than is presently achievable with the algo- 
rithms. Another reason for this problem is an 
incomplete understanding of the structural 
features, including motions, that are needed 
for catalysis. Classical protein-structure de- 
termination with x-ray crystallography uses 
transition-state analogs or suicide substrates 
to reveal how a substrate binds to the active 
site of an enzyme, to learn where binding pock- 
ets are located, and to gather information about 
protein motions that contribute to catalysis. 
For instance, the structural basis for the higher 
catalytic activity of lipases at an oil-water inter- 
face remained a puzzle despite several availa- 
ble x-ray structures of lipases, which showed a 
buried catalytic site. Brzozowski et al. solved 
this puzzle in 1991 with a lipase structure that 
showed a dramatic movement of a helical lid 
covering the catalytic site (108). A more recent 
puzzle was a monoamine oxidase, whose x-ray 
structure also showed a closed conformation 
that could not explain the effects of mutations 
on catalytic activity and substrate scope. In 
this case, a computational approach using 
long-timescale molecular dynamics identified 
partially and fully open conformations that 
could account for the changes in catalysis (109). 
Current structure-prediction tools do not include 
protein dynamics in most cases and likewise do 
not include information about disulfide bonds 
and posttranslational modifications (which are 
often not covered in protein sequences in data- 
bases) such as glycosylations and acetylations, 
mandating the use of complementary computa- 
tional tools and experiments. 




Connecting protein structures to protein 
functions such as reactivity or selectivity is an 
advancing field in machine learning. Both 
neural networks based on protein structure 
(110) and those based on sequences using 
contrastive learning have become more reliable 
at predicting protein function (111). In many 
cases, the application of machine learning to 
catalytic properties of proteins is limited by 
the availability of reliable experimental data to 
train the neural network; recently, the 
EnzymeML database was established to address 
this issue (112). In special cases in which the 
properties of tens of thousands of protein var- 
iants have been measured, machine learning 
can predict variants with improved binding 
properties (113, 114). The use of data from 
computational modeling predictions has also 
aided machine-learning approaches to design 
more-selective enzymes (115, 116). 

A further challenge in enzyme design is pre- 
dicting the effects of distant amino acid sub- 
stitutions. Experiments reveal that such residues 
influence catalysis, making their prediction 
essential to the design of efficient enzymes. 
These residues are too far away to interact 
directly with the substrate. They may act through a 
domino-like effect as the protein flexes and 
moves to alter positioning and flexibility of the 
catalytic residues and the substrate(s) within 
the active site, similarly to allostery. One 
promising computational approach to capture the 
effect of relevant distant residues is the 
shortest path map (117). Starting from a molecular 
dynamics simulation, this approach identifies 
residues that move together and are connected 
to the catalytic residues. When this 
strategy was applied, distant substitutions 
abolished the allosteric activation of tryptophan 
synthase B, activating it permanently (118). 

Fig. 5. Selected examples of recent biocatalysis-based products. Examples shown include biotherapeutics [engineered α-galactosidase A (157), insulin 
analogs (124), and oligonucleotides synthesized from TdTs (60)]; potential bulk products [starch synthesized from CO2 (126) and plastic bottles recycled from 
PET (149)]; pharmaceuticals [molnupiravir (121), ulevostinag (120), islatravir (122), ikarugamycin (123), and an intermediate to BMS-986278 (130, 131)]; and the 
fragrance Ambrofix (127-129). 


Applications 


Combining several engineered enzymes into a 
cascade creates new biochemical pathways 
(119). Recent examples (Fig. 5) include the 
syntheses of pharmaceuticals, such as the cyclic 
dinucleotide ulevostinag (120), molnupiravir 
(121), islatravir (122), and ikarugamycin (123), 
and bioconjugates to make novel insulins (124), 
as well as artificial sweeteners. CO2-fixation 
pathways capable of yielding compounds through 
the CETCH cycle (125), and even starch synthesis 
from CO2-derived methanol (126), are further 
examples of complex engineered pathways. 

In the fragrance industry, (-)-Ambrox, which 
has an ambery and woody odor, is one of the 
most widely used biodegradable fragrance in- 
gredients. The previous synthesis of (-)-Ambrox 
(marketed as Ambrofix) was a multistep route 
from the diterpene sclareol isolated from clary 
sage. A new one-step process from biosyn- 
thetically manufactured homofarnesol uses 
an engineered squalene-hopene cyclase (Fig. 
5) (127-129). 

Another example of an enzyme cascade is a 
one-pot, two-enzyme reduction of allylic ketone 
3-oxocyclohexene-1-carboxylate to the corre- 
sponding saturated alcohol reported in early 
2023 by Bristol Myers Squibb (130) and Codexis 
(131). The ene reductase (ERED) and ketoreduc- 
tase (KRED) process showed high enantio- and 
chemoselectivity. Maximum product yields 
benefited from the compatibility of the co- 
factor regeneration system and avoidance of 
side reactions. Separately, substrate inhibition 
and enzyme robustness under process condi- 
tions were improved to deliver a scalable en- 
zyme cascade while greatly reducing the number 
of synthesis steps, improving overall yield, and 
lowering process mass intensity (PMI)—an 
indicator of environmental impact—from a 
value of 2017 for the initial chemical route to 
a PMI of only 170. 
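
As a rough illustration of what the PMI figures above mean, the sketch below uses the commonly cited definition of process mass intensity (total mass of all materials fed to a process divided by the mass of isolated product), which is an assumption here because the text does not spell it out. The function names and the example input masses are hypothetical; only the values 2017 and 170 come from the text.

```python
def process_mass_intensity(input_masses_kg, product_kg):
    """PMI = total mass of materials used / mass of product (dimensionless)."""
    if product_kg <= 0:
        raise ValueError("product mass must be positive")
    return sum(input_masses_kg) / product_kg


def fold_reduction(old_pmi, new_pmi):
    """Factor by which the material demand per kilogram of product has dropped."""
    return old_pmi / new_pmi


if __name__ == "__main__":
    # Hypothetical inputs (solvent, reagents, water) for 1 kg of product.
    print(process_mass_intensity([50.0, 100.0, 20.0], 1.0))
    # PMI values quoted in the text for the chemical route and the enzyme cascade.
    print(round(fold_reduction(2017, 170), 1))
```

On this definition, moving from a PMI of 2017 to 170 corresponds to roughly a 12-fold reduction in the mass of material consumed per kilogram of product.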

The development of novel enzyme cascades 
has been simplified by the availability of retro- 
synthesis tools—the disconnection approach 
used by organic chemists for decades—to plan 
synthetic routes (132, 133). Programs [e.g., 
RetroBioCat (134)] identify possible routes 
for multistep biocatalytic reactions while 
keeping in mind aspects such as commercial 
availability, cofactor requirements, and sol- 
vent tolerance (135-138). Machine learning 
can also assist with this purpose (139). Sim- 
ilarly, metabolic pathway planning predicts 
routes to complex natural products and also 
simpler molecules such as 1,4-butanediol (140). 

Developing drugs for more difficult-to-drug 
targets leads to molecules with increasing num- 
bers of chiral centers, beyond-rule-of-5 scaffolds 
(141), and bioconjugates (142) [e.g., 
radiotherapeutics, antisense oligonucleotide 
therapeutics, and antibody-drug conjugates (143)]. 
Addressing these synthetic (and later 
manufacturing) challenges requires finding enzymatic 
tools to produce these complex scaffolds. For 
example, it was reported that selective acyla- 
tion of three amino groups of insulin—either of 
the two amino termini or at an internal lysine 
residue—was achieved by using engineered 
penicillin G acylase. These enzyme variants 
enabled installation of a cleavable phenylacet- 
amide protecting group in a programmable 
manner, while leaving one or more amino groups 
unprotected for subsequent chemical modifi- 
cation. The high enzymatic regioselectivity im- 
proved the overall purity and yield of the insulin 
conjugates (124) (Fig. 5). 

Despite the fact that chemical synthesis meth- 
odologies for modified oligonucleotides are 
well established, large-scale manufacturing in 
a sustainable and economically feasible 
manner remains challenging (144). Losses from 
imperfect chemical coupling accumulate as 
the length of the oligonucleotide increases. 
Several companies and academic laboratories 
have developed technology to enzymatically 
assemble chemically modified oligonucleotides 
(62, 145). The approach uses the well-known 
catalytic activity of RNA ligases to form an 
adenosine triphosphate (ATP)-dependent 
covalent bond between the 3'-OH and 5'-PO4 
termini of two oligoribonucleotides to form 
one larger, continuous strand (146). A recent 
method describes the use of such an RNA ligase 
to synthesize a chemically modified RNA start- 
ing from short (<9-nucleotide) oligonucleotide 
fragments (147) with 40 to 80% conversion. 
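
The way chemical coupling losses accumulate with oligonucleotide length, noted at the start of this paragraph, can be made concrete with a simple model. The sketch below assumes that an n-mer is built by n - 1 sequential couplings, each succeeding independently with the same efficiency. The 99% per-step efficiency is an illustrative assumption rather than a value from the cited work, and the function name is hypothetical.

```python
def full_length_yield(n_nucleotides, coupling_efficiency):
    """Fraction of chains reaching full length after n - 1 sequential couplings."""
    if n_nucleotides < 1:
        raise ValueError("an oligonucleotide needs at least one nucleotide")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling efficiency must lie between 0 and 1")
    return coupling_efficiency ** (n_nucleotides - 1)


if __name__ == "__main__":
    efficiency = 0.99  # assumed per-step coupling efficiency (illustrative)
    for n in (20, 50, 100):
        print(f"{n}-mer: {full_length_yield(n, efficiency):.1%} full-length product")
```

Under this simplified model, a per-step efficiency of 99% gives about 83% full-length product for a 20-mer but only about 37% for a 100-mer, which is one motivation for assembling long strands from short fragments.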

An orthogonal enzymatic approach to oli- 
gonucleotides (62) used self-priming (hairpin- 
like) templates added in catalytic amounts. 
DNA polymerase amplifies the complemen- 
tary sequence in the presence of nucleoside 
triphosphate (NTP) building blocks. A spe- 
cific endonuclease cleaves the newly synthe- 
sized chain and then releases the template 
for the next catalytic cycle. The synthesis uses 
unprotected building blocks in water, without 
large amounts of acetonitrile (as is typically the 
case in solid-phase synthesis), hence addressing 
a major sustainability challenge. Whether 
unprotected NTPs can be sourced on 
sufficiently large scales, with acceptable 
timelines and cost, to fully embed this concept on 
a manufacturing scale remains to be seen. 
However, given the success of enzymatic syn- 
thesis of unnatural mono- and dinucleosides 
(e.g., ulevostinag, Fig. 5), biocatalytic produc- 
tion can be considered feasible. 

Enzymes can also be used to address plas- 
tic pollution. Since the 1950s, almost 9 billion 
tons of plastics have been manufactured, and 
waste plastics are a major environmental con- 
cern worldwide. Considerable efforts have been 
made to degrade and recycle commodity plastics 
with biocatalytic approaches (148), especially 
for polyesters, polyamides, and polyurethanes, 
because key chemical bonds can be hydrolyzed 
to yield the corresponding monomers, which 
can be used to make new virgin polymer. The 
most advanced recycling process has been 
reported for PET. Rational design methods 
created a quadruple mutant of the leaf-branch 
compost cutinase (LCC) that is efficient enough 
to establish a robust process currently 
implemented on an industrial scale (149) (Fig. 5). In a 
more recent example, a PETase from Ideonella 
sakaiensis was engineered by using machine 
learning (150). Present research efforts focus 
on polyamide and polyurethane hydrolyzing 
enzymes; the first candidate enzymes were 
recently identified in a metagenome library (151). 


Conclusions 


Advanced tools developed in the past few years 
have sped up protein engineering to the point 
where enzymes are an equal counterpart to 
conventional organic synthesis catalysts, re- 
sulting in an upsurge of biocatalysis applica- 
tions in pharmaceutical manufacture (152) and 
many other areas. Future enzyme engineering 
also needs to speed up discovery-design-test 
cycles to maintain momentum and expand 
synthetic contributions to new enzyme classes. 
Likewise, the combination of enzymes with 
other (catalytic) synthetic chemistry methods 
such as transition metal catalysis, photocat- 
alysis, and electrocatalysis (119, 153, 154) will 
be required to address humankind’s challenges, 
such as combating climate change, degrading 
plastic waste, transitioning to renewable en- 
ergy, and developing new therapies for medical 
treatment. Repurposing of enzyme mechanisms 
to broaden the repertoire of biocatalytic reac- 
tions represents a new concept for creating 
desired activities in enzyme catalysts. Other 
exciting strategies en route to novel enzymatic 
reactivities are de novo enzyme design (54) 
and computer “hallucination” (98). 

Eleven years ago, the third wave of biocat- 
alysis catapulted enzyme technology from 
“designing a biocatalytic process around the 
limitations of the enzyme to engineering the 
enzyme to fit the specifications of the process” 
(1). Now, the technology has taken yet another 
leap: The enzyme engineer no longer has to 
depend on the enzyme’s native catalytic prowess 
to define the accessible chemistry but can 
instead dream up bold new-to-nature reactivities 
and bring these to biocatalytic reality. 


TISSUE MORPHOGENESIS 


Morphogens enable interacting supracellular phases 
that generate organ architecture 


Sichen Yang{, Karl H. Palmquist}, Levy Nathan, Charlotte R. Pfeifer, Paula J. Schultheiss, 
Anurag Sharma, Lance C. Kam, Pearson W. Miller, Amy E. Shyer*+, Alan R. Rodrigues*+ 


INTRODUCTION: During vertebrate organ mor- 
phogenesis, large collectives of cells robustly 
self-organize to form architectural units (bones, 
villi, follicles) whose form persists into adult- 
hood. Over the past few decades, mechanisms 
of organ morphogenesis have been developed 
predominantly through molecular, genetic, and 
cellular frameworks. More recently, there has 
been a resurgence of interest in collective cell 
and tissue mechanics during organ formation. 
This approach has amplified the need to clar- 
ify and unambiguously link events across bio- 
logical length scales. Doing so may require 
reassessing canonical models that continue to 
guide the field. The most recognized model for 
organ formation centers around morphogens as 
determinants of gene expression and morphological 
patterns. The classical view of a morphogen 
is that morphogen gradients specify 
differential gene expression in a distinct spatial 
order. Because morphogen expression colocalizes 
with emerging feather and hair follicles, 
the skin has served as a paradigmatic example 
of such morphogen prepatterning mechanisms. 

[Figure panels: (A) Gene expression space: avian skin dermis, plotted as UMAP 1 versus UMAP 2, with the margin labeled. (C) Parameter space of supracellular phases.] 


RATIONALE: Recent work in the avian skin has 
shown that genes that are thought to establish 
molecular prepatterns are not expressed focally 
before the initiation of feather follicle morphol- 
ogy. Instead, the self-organization of mesen- 
chymal progenitor cells in the dermis initiates 
morphological patterns and activates morpho- 
gen gene expression in emerging follicles. The 
sufficiency of mesenchymal self-organization 
for initiating patterns has been demonstrated 
in ex vivo reconstitution systems in which dermal 
cells and extracellular matrix can recapitulate 
the process of regular follicle pattern 
formation. In this work, we asked, what functional 
role do morphogens serve if not to establish 
a pattern of follicles across the skin? 

Morphogens enable interacting supracellular phases that shape organs. (A) Single-cell gene expression 
profiling of avian dermal cells uncovered two follicle domains (core and margin). UMAP, uniform manifold 
approximation and projection. (B) Morphogens tune supracellular material and mechanical properties. 
Δp, change in pressure. (C) FGF enables a solid-like core, and BMP enables a contractile fluid-like margin. 
(D) A mechanical instability generated between domains with distinct material properties drives organ 
(skin) budding. [Created with BioRender.com] 


RESULTS: Working in avian skin tissues, we 
found that the morphogens fibroblast growth 
factor (FGF) and bone morphogenetic protein 
(BMP) act just before the follicle buds out of 
the plane of the skin. Together, these morph- 
ogens enable the generation of nested spatial 
domains within the dermis of the follicle: an 
FGF-active hemispheric core and a surround- 
ing BMP-active margin. In investigating what 
roles these morphogens play in these domains, 
we considered biophysical effects that emerge 
at the supracellular scale. We measured mate- 
rial properties using techniques such as atomic 
force microscopy and micropipette aspira- 
tions and used custom assays to investigate the 
effect of these morphogens on supracellular 
material and mechanical properties. We iden- 
tified morphogen-enabled supracellular material 
property differences (e.g., elasticity) that were 
minimal or lost at cellular scales. Specifically, 
FGF “solidifies” the dermal core, whereas BMP 
maintains fluidity and increases the mechanical 
activity of the margin. We hypothesized 
that the emergence of these two materially 
distinct, adjacent tissue phases produces an 
instability whereby the active, contractile fluid 
margin propels the inner solidified core into 
the epidermis to induce budding. Indeed, by 
coupling quantitative phase-field modeling 
and experimental data, we showed that these 
two phases create an unstable complex resolved 
through budding. 


CONCLUSION: Our work shows that understanding 
the role of morphogens in morphogenesis 
requires characterizing emergent material and 
mechanical properties at the supracellular 
scale. This stands in contrast to the prevailing 
view that the functional effects of morphogens 
are on properties that can be characterized at 
the scale of individual cells, such as concen- 
tration sensing, proliferation, and chemotaxis. 
This paradigm highlights the need to distin- 
guish between the proximal effects of morpho- 
gens, which include modulating gene expression 
of individual cells, and their ultimate functional 
effects, which enable the formation of distinct 
supracellular phases that are capable of 
morphological transformation. 

During vertebrate organogenesis, increases in morphological complexity are tightly coupled to 
morphogen expression. In this work, we studied how morphogens influence self-organizing processes 
at the collective or “supra”-cellular scale in avian skin. We made physical measurements across 
length scales, which revealed morphogen-enabled material property differences that were amplified 
at supracellular scales in comparison to cellular scales. At the supracellular scale, we found that 
fibroblast growth factor (FGF) promoted “solidification” of tissues, whereas bone morphogenetic 
protein (BMP) promoted fluidity and enhanced mechanical activity. Together, these effects created 
basement membrane-less compartments within mesenchymal tissue that were mechanically primed 
to drive avian skin tissue budding. Understanding this multiscale process requires the ability to 
distinguish between proximal effects of morphogens that occur at the cellular scale and their functional 
effects, which emerge at the supracellular scale. 


Organ morphogenesis involves increases 
in morphological complexity, or symmetry 
breaking, through intrinsic processes 
of self-organization. Genetic and 
molecular investigations have identified 
conserved sets of proteins, termed morphogens, 
that have been shown to be essential for 
the creation of proper tissue morphology (1-3). 
Theoretical models and experimental studies 
have proposed that morphogens initiate mor- 
phological symmetry breaking through chem- 
ical diffusion mechanisms. Resulting spatial 
differences in morphogen concentration cre- 
ate molecular prepatterns that then instruct 
structural changes across a tissue (4-7). By 
contrast, an alternative source of symmetry 
breaking has been proposed by theoretical 
work focused on cell and tissue mechanics, 
whereby mechanical instabilities amid cellular 
collectives can also serve to initiate symmetry 
breaking in a structurally homogeneous tissue 
(8, 9). Given the increasing appreciation that 
organogenesis is an irreducibly multiscale 
phenomenon (10), a present challenge is to 
clarify and unambiguously link the respective 
roles played by morphogens and collective cell 
mechanics. 

The avian skin has served as a canonical 
example of how morphogen-mediated prepat- 
terning, which is achieved through putative 
Turing-like reaction-diffusion mechanisms, 
instructs the creation of a repetitive morpho- 
logical pattern (11). However, a prepatterning- 
based role for morphogen-induced symmetry 
breaking was put into question by observa- 
tions that follicle gene expression activation 
occurs at the same time as, and not before, the 
onset of morphological changes (12). Instead, 
the initiation of follicle aggregates has been 
shown to be a direct result of the mechanical 
self-organization of dermal cells. Furthermore, 
this collective cell self-organization also results 
in the activation of gene expression cascades 
within emerging follicles through a mechano- 
transductive signaling pathway, which dispenses 
with the need for a molecular diffusion-based 
mode of establishing gene expression patterns 
across a field of tissue (12). A reconstitution 
system in which primary dermal progenitors 
are cultured on a collagen substrate that 
mimics the in vivo environment has revealed 
that cell-extracellular matrix (ECM) networks 
are essential for collective cell self-organization. 
Moreover, theory coupled to experiment in- 
dicates that the cell-ECM composite creates ef- 
fective material properties at the supracellular 
scale that are analogous to those of a contractile 
fluid (13). The fluid nature of tissues was 
highlighted in a seminal work several decades 
ago (14). Together, these results emphasize the 
need to consider supracellular material prop- 
erties necessary for tissue symmetry breaking 
that are not reducible to the properties of 
individual cells. 

Although recent studies indicate that morphogens 
do not establish prepatterns to instruct 
morphological symmetry breaking in the 
avian skin, decades of study have shown that 
numerous established morphogens are essen- 
tial for feather formation and are expressed in 
patterns that coincide with emerging follicles 
(11, 15). Thus, the avian skin offers an ideal 
opportunity to go beyond canonical morpho- 
gen paradigms to uncover unappreciated roles 
for morphogens in shaping tissues. In this work, 
we set out to characterize the role of morpho- 
gens in avian skin during feather follicle 
morphogenesis and their relation to emergent 
supracellular mechanics. 


Characterization of supracellular 
structural motifs that underlie 
follicle emergence 


Given that it has been recently established that 
follicle-specific expression of genes emerges 
just after the follicle primordium initiates (12), 
we set out to perform a spatiotemporal analy- 
sis of supracellular structural changes that oc- 
cur at and after this time stage. To do so, we 
visualized the nuclei, actin, basement mem- 
brane, and epidermal cell membranes in sec- 
tions of dorsal (back) skin from embryonic 
day 6 (E6) to E8 chicken embryos. This char- 
acterization indicated that follicle emergence 
could be subtyped into three periods, which 
we term the precondensation, condensation, 
and budding stages. During the preconden- 
sation stage, the epidermis was a single-cell 
thick cuboidal epithelium (Fig. 1A). Whereas 
the epidermis retained this morphology, the 
basement membrane transitioned to a buckled 
architecture. At the condensation stage, the 
epidermal placode appeared as a thickened 
pseudostratified epithelium, and the basement 
membrane further buckled above an aggregat- 
ing, actin-rich dermal condensation (Fig. 1A). 
The end of the condensation stage was marked 
by a dip of the epidermis and basement mem- 
brane into the dermis, which was associated 
with a reduction of basement membrane buck- 
ling. Notably, up to this time point, which in- 
cluded the precondensation and condensation 
stages, the surface of the skin tissue remained 
flat. After the establishment of the follicle 
primordia, as the follicle gene expression pro- 
gram was being initiated, the epidermis and 
basement membrane inverted from a down- 
ward to an upward curvature (Fig. 1A). After 
this inversion, the epidermis and dermis began 
to protrude or bud out of the plane of the back 
skin. During this budding stage, the laminin 
signal within the basement membrane at the 
center of the bud markedly decreased and ap- 
peared as puncta within the dermal follicle 
condensate (Fig. 1A). 

Although follicle and interfollicle domains 
across the plane of the skin have been previously 
identified (16), our characterization of 
supracellular structure revealed a symmetry-breaking 
event that occurs within the dermal 
condensate as the follicle buds. Specifically, 
two domains emerged within the condensate, 
which we term the core and the margin. These 
domains displayed differences in cytoskeletal, 
nuclear, and ECM arrangement. The core displayed 
increased actin, round and isotropically 
arranged nuclei (Fig. 1, B to D, and fig. 
S1A), and diminished fibronectin (Fig. 1E). By 
contrast, the margin contained elongated and 
aligned nuclei (Fig. 1, B and D, and fig. S1A), 
less actin, and increased fibronectin (Fig. 1E). 
Temporally, these domains emerged during 
the condensation stage and persisted through 
the budding stage, which suggests that a shift 
in the position of supracellular domains contributes 
to budding out of the plane. 

Fig. 1. The emergence of distinct structural domains in the dermis coincides with feather follicle primordia budding. (A) Developmental time 
course of fixed, longitudinal sections showing the transformation from naive precondensate skin (P1 and P2) to the condensation of a multicellular aggregate 
in the dermis (C1 and C2) to the formation of a mature feather follicle primordium that protrudes from the two-dimensional surface of the skin (B1 
to B4). Maximum intensity projections of F-actin (phalloidin), nuclei (DAPI), the basement membrane (laminin), and epithelial cell junctions (E-cadherin) are 
shown. (B) Overlay of example nuclear outlines in the core (orange) and margin (blue) (top) that correspond to the populations of nuclei used to calculate the 
aspect ratio (bottom). N = 30 nuclei per region. (C) Maximum intensity projection showing the composite of nuclei (red), actin (cyan), and laminin 
(yellow) at three stages from the corresponding longitudinal sections in (A). (D) Schematic showing the approximate z-section plane in (E). The dotted line 
indicates the approximate z-plane section for images in (E). (E) Maximum intensity projections of a single feather follicle primordium at three stages 
corresponding to (C) that are stained for fibronectin, F-actin, and nuclei in whole-mount skin samples. Scale bars are 100 μm. ****p < 0.0001. 


Identification of distinct transcriptomic 
signatures for supracellular 
structural domains 


After uncovering the existence of core and 
margin supracellular domains, we sought to 
determine whether each domain corresponded 
to a distinct gene expression program. We per- 
formed single-cell sequencing on the E8 chick 
back skin, a time point that includes a range of 
precondensation, condensation, and budding 
stages (Fig. 2A). Although the dermis at this 
stage was composed mainly of fibroblast progenitors,
blood, muscle, and immune fates
began to be found in the tissue as well (Fig. 
2B). To focus on regional differences within 
dermal fibroblasts, we removed cells with clear 
nonfibroblast fate from our final analysis (fig. 
S2A). Using fluorescence in situ hybridization
(FISH), we identified a large cluster within the 
fibroblast population as subjacent to the follicle- 
forming dermis (fig. S2B). We then focused 
our analysis on dermal fibroblasts of the super- 
ficial dermis, and clustering analysis revealed 
several distinct clusters (Fig. 2C and fig. S2, C
and D). Notably, we found clusters expressing 
markers that correspond to the core and margin
domains of the follicle dermis (Fig. 2, C to F). Two 
other clusters expressed markers associated 
with more nascent stages before follicle budding 
(Fig. 2, C to F). Thus, as suggested by our mor- 
phological characterization, the dermal conden- 
sate of the follicle is composed of two domains 
with differing gene expression. In addition, 
whereas many single-cell sequencing studies 
map differences between cell types across a 
tissue or through time, our analysis highlights 
that single-cell sequencing can also reveal finer 
distinctions within a single type (e.g., dermal 
progenitor). Focusing on a small yet mor- 
phologically important temporal window, we 
identified distinct clusters that, rather than 
marking divergent final cell fates, indicated the 
tuning of cell properties that enabled marked 
structural differences visible at the supracellu- 
lar scale. 


Fibroblast growth factor and bone morphogenetic 
protein signaling activity delineates distinct 
domains during follicle budding 


We hypothesized that the core and margin 
domains within the budding follicle, which 
differed in their supracellular structural as 
well as their transcriptional profiles, depended 
on differential morphogen activity. Morpho- 
gen expression characterization in previ- 
ous studies has focused on distinguishing 
between follicle and interfollicle expression 
during the condensation and budding stages 
(17, 18). To discover morphogen candidates 
critical for mediating the core and margin 
domains, we focused on morphogens ex- 
pressed at the onset of follicle formation and 
morphogens that appeared in our single-cell 
dataset (15, 19, 20). The intersection of these 
criteria revealed fibroblast growth factor 
(FGF) and bone morphogenetic protein (BMP) 
as the two leading candidates for further 
analysis. FGF and BMP ligands are expressed 
in the epidermis and core dermis (fig. S3A) 
(21, 22); however, we used antibody staining 
of phosphorylated SMAD (pSMAD) and phos- 
phorylated extracellular signal-regulated 
kinase (pERK) to determine regional pathway
activity. 

With regard to FGF signaling activity, we 
found that the pERK signal coincided specif- 
ically with the core region of the follicle dermis 
but not the margin (Fig. 2, G to I). However, 
pSMAD transitioned from being localized to 
the condensate to coinciding with the margin 
at budding stages. Thus, at budding stages, 
BMP signaling activity did not overlap with its 
domain of gene expression (Fig. 2, G to I, and 
fig. S3A). We investigated the origins of this 
activity pattern and confirmed that the BMP 
inhibitor follistatin was expressed in the follicle 
epidermis (fig. S3, B and C) (23). The absence 
of pSMAD in the core indicates that follistatin 
inhibited BMP-pathway activity in the region 


Yang et al., Science 382, eadg5579 (2023) 


24 November 2023 


in which BMP was expressed. Thus, this sys- 
tem of molecular activators and an inhibitor 
generates a margin that is exposed to BMP 
ligand that spreads from the secreting core but 
escapes the influence of follistatin, its pathway 
inhibitor. Both of these activity patterns ap- 
peared in discrete domains and were not pre- 
sent in a graded morphogen-type pattern, as 
might have been expected. Thus, FGF and BMP 
are the morphogens that are responsible for 
enabling the generation of distinct, neighbor- 
ing domains before follicle budding. 


Direct biophysical measurements reveal 
emergent elasticity changes at the 
supracellular scale 


To complement our molecular characterization 
of the core and margin regions with biophysical 
characterization, we investigated the material 
properties in these domains by applying atom- 
ic force microscopy (AFM) to frozen sections of 
the skin (Fig. 3A). We performed measure- 
ments using two probe sizes that differed by 
an order of magnitude (5 and 45 µm). Probe
sizes of 5 µm or less are frequently used to
measure cellular- and subcellular-scale stiffness
(24). Our inclusion of a less commonly used
large probe size allowed for the capture of
properties at the supracellular scale. We found
that at the supracellular scale (45-µm probe),
the core of the tissue was 2.38-fold stiffer than
the margin (Fig. 3B). This fold difference in
stiffness between the core and margin decreased
to 1.48-fold when stiffness was measured at
the cellular scale (5 µm). This dependence of
the core-margin stiffness differential on the 
length scale of measurement indicates that 
the supracellular scale can possess emergent 
material properties that are not detectable at 
the cellular-length scale. Such length-scale de- 
pendence of material properties is analogous 
to observations made in granular media (e.g., 
sand) (25). 
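
As a rough illustration of this length-scale dependence, the two fold differences reported above can be compared directly. The sketch below is not part of the original analysis: the function name and the idea of summarizing the comparison as a ratio of ratios are assumptions introduced here for illustration only.

```python
# Illustrative only: compares the core:margin stiffness contrast reported
# at the supracellular (45-µm probe) and cellular (5-µm probe) scales.
# The "scale-dependence factor" is a hypothetical summary statistic,
# not a quantity defined in the text.

def scale_dependence_factor(fold_supracellular: float, fold_cellular: float) -> float:
    """Return how much larger the core:margin contrast is at the larger scale."""
    if fold_supracellular <= 0 or fold_cellular <= 0:
        raise ValueError("fold differences must be positive")
    return fold_supracellular / fold_cellular

FOLD_45_UM = 2.38  # core vs. margin, 45-µm probe (value reported in the text)
FOLD_5_UM = 1.48   # core vs. margin, 5-µm probe (value reported in the text)

factor = scale_dependence_factor(FOLD_45_UM, FOLD_5_UM)
print(f"Core:margin contrast is {factor:.2f}x larger at the supracellular scale")
```

A factor of 1 would mean that the stiffness contrast does not depend on the measurement scale; a factor above 1 is what one would expect if part of the contrast only appears at the supracellular scale.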

Given the spatial correlation between mor- 
phogen activity (core-FGF, margin-BMP) and 
supracellular material property differences (core, 
more stiff; margin, less stiff), we tested a model 
whereby morphogens, through the tuning of 
many cell features, enable emergent tissue 
material properties of adjacent domains. Mor- 
phogens are often characterized as serving a 
role in inducing a new cell fate (e.g., in directed 
differentiation protocols) (26, 27). Further- 
more, when considering effects on morphol- 
ogy, studies have centered around individual 
cell behaviors, in particular, growth and migra- 
tion (7, 18, 28, 29). In this work, however, we 
sought to determine whether the ultimate 
functional effect of morphogens arises on a 
length scale beyond the detection of a single 
cell. This line of inquiry is supported by an 
increasing appreciation for how biophysical 
features of cellular collectives influence mor- 
phogenesis (30-32). 

To test whether FGF and BMP were sufficient
to affect stiffness at the supracellular
scale, we performed AFM on ex vivo dermal
reconstitutions treated with either FGF or BMP
and compared the results with those from
control samples. Treatment with morphogens led
to a significantly larger change in material
properties (stiffness) detected at the supracellular
scale (45-µm probe) than at the cellular scale
(Fig. 3C). To confirm these findings using an
alternative yet established tool that directly
measures material properties, we performed
micropipette aspiration (MPA) (Fig. 3D) on
dermal reconstitutions treated with either FGF
or BMP and compared the results with those
from control samples (33). To perform cellular-
and supracellular-scale analyses, samples were
aspirated using pipette tips of 5 to 7 µm and
50 µm, respectively. We found significant
differences in stiffness in morphogen-treated
samples when measurements were made at
the supracellular scale (50-µm pipette) but no
significant differences between treatment and
control samples when measurements were
made at the cellular scale (5- to 7-µm pipette); this
absence of significant differences at the
cellular scale further corroborates the emergent
nature of these morphogen-enabled effects.

Finally, we developed a method of inferring 
material phase property changes at the supra- 
cellular scale based on scanning electron mi- 
croscopy (SEM). We noted that the process of 
drying during preparation for SEM imaging 
generated a pattern of cracks across dermal 
reconstitution samples. These crack patterns 
differed across treatment conditions, with sub- 
stantially larger fractures observed in FGF- 
treated samples than in BMP-treated and control 
samples (fig. S4A). We reasoned that these pat- 
terns of fractures may be akin to the cracks that 
form in many common materials upon drying, 
including sand and concrete. In these cases, 
the material shrinks as it dries to form predic- 
table patterns of cracking, which are indicative 
of the physical properties of the material (34). 
Along these lines, we reasoned that the larger 
cracks that occurred with FGF treatment were 
indicative of increased brittleness or solidity. 
In addition, the absence of large- or medium- 
size fractures in BMP-treated samples indicated 
that they are the least brittle or solid-like. 
Given that the cracking pattern occurs at a 
scale greater than the cell, this assay offered 
a newly discovered opportunity to assess
supracellular brittleness or solidity.

Taken together, these findings indicate that 
the ultimate functional effect of FGF and BMP 
is to enable material property changes that are 
emergent and therefore only detectable through
measurements at the supracellular scale. These 
effects on material properties are exerted in 
tissue space, which suggests a functional role 
for morphogens in enabling the formation of 
physically distinct basement membrane-less 
compartments that coexist and interact within
a single mesenchymal tissue.

Fig. 3. Biophysical measurements reveal emergent elasticity changes at the
supracellular scale. (A) Schematic of experimental approach for AFM analysis of
in vivo tissue sections. (B) Results from in vivo AFM using cellular-scale (5 µm)
and supracellular-scale (45 µm) probes. The ratio of Young's modulus with respect
to the interbud (Inter.) value is plotted at both length scales. (C) Results from in vitro
AFM using 5- and 45-µm probes on dense dermal reconstitutions treated with either
BMP or FGF or left untreated (control). The Young's modulus ratio is plotted with
respect to control values. (D) (Top) Example bright-field microscopy images of dense
dermal reconstitutions being aspirated into a 50-µm glass pipette at maximum
extension for control, BMP-treated, and FGF-treated samples. (Bottom) Young's
modulus of dense dermal reconstitutions for control, BMP, and FGF conditions
using two different pipette sizes: cellular (5 to 7 µm) and supracellular (50 µm).
N = 16 to 32 measurements for six or seven follicles in (B), 20 measurements
(each measurement is the average of a stiffness map containing a grid of measurements;
see Materials and methods) over four independent cultures using the 5-µm probe and
113 to 120 measurements over four independent cultures using the 45-µm probe in
(C), and 18 to 32 measurements over three independent trials in (D). Scale bar is
100 µm. *p < 0.05; **p < 0.005; ***p < 0.001; ****p < 0.0001; n.s., not significant.


Integrating experiment and mathematical
modeling reveals the effects of
morphogens on supracellular
viscous properties


Our direct measurements of stiffness by AFM
and MPA indicated the presence of emergent
material properties that relate to elastic or
solid-phase properties. Given that tissues also
concurrently possess viscous or fluid-like properties,
we next sought to characterize viscous
properties enabled by morphogens.

To do so, we developed an assay that shares
conceptual resonance with those used to study
molecular condensates (35). Specifically, the
size and number of aggregates formed in
suspensions of molecules enable the inference of
material properties across the liquid-to-solid
phase continuum (36). A greater number of
smaller aggregates would indicate solid-phase
properties, whereas a smaller number of larger
aggregates would indicate fluid-phase
properties. To adapt such assays to the supracellular
scale, we cultured a low-density cell
suspension of primary dermal cells as a hanging drop
and then characterized patterns of cell
condensation. Suspensions treated with no signal
or BMP formed a few large spheroids (Fig. 4, A
and B, and fig. S5B). Conversely, suspensions
treated with FGF resolved into many smaller
spheroids. These results suggest that cells
exposed to FGF signaling generate supracellular
material properties that are more solid-like than
those of control and BMP-treated cells.

To further investigate viscous properties, we 
examined the ability of preformed spheroids 
to merge when placed in close contact. Fluid 
tissues will readily merge, whereas more-solid- 
like tissue will fail to merge (37-39). Indeed, 
FGF-treated spheroids failed to merge, whereas 
control and BMP-treated spheroids merged
readily, further supporting the emergence of
solid properties of an FGF-influenced cell
collective (Fig. 4, C and D, and fig. S5A). We observed
hallmarks of stiffness reflected in cytoskeletal
architecture upon treatment with FGF that
likely reflected cells in a stiffer environment
(40-43) and also contributed to tissue
stiffness (fig. S5C).

Fig. 4. Morphogens enable distinct viscous properties at the supracellular
length scale. (A) Spheroids formed in a hanging drop of low-cell concentration
suspensions. (B) (Top) Area of spheroids formed in (A). (Bottom) Number of
spheroids per drop formed in (A). N = 8 to 12 drops per condition. (C) Merging
dynamics of two spheroids initially formed in high-cell concentration suspensions
for control, FGF, and BMP conditions. (D) Contact length and interspheroid angle
(in degrees) for conditions shown in (C) measured at the 12-hour time point.
N = 10 sets of spheroids per condition. (E) Simulations of droplet merger
informed by AFM and MPA were used to deduce viscous relaxation timescales.
Full details of these simulations are described in the supplementary text.
(F) Interspheroid angle (in degrees) for conditions shown in (E) measured at
the 12-hour time point. (G) Spring-and-dashpot model. Our experiments led us
to characterize the dermal condensate as a viscoelastic fluid whose passive
mechanical response can be represented by spring-and-dashpot diagrams.
We observed changes in the mechanical response in response to morphogen
treatment, with FGF condensates demonstrating increases in both the
elastic modulus and the viscous relaxation timescale. BMP spheroids have
increased stiffness but demonstrate a faster viscous relaxation consistent
with increased fluidity. Scale bars are 100 µm. **p < 0.005; ****p < 0.0001;
n.s., not significant.

To gain a quantitative understanding of
supracellular viscous properties, we developed
a model to simulate spheroid merging (Fig. 4,
E and F). By inputting values of stiffness and
surface tension gained from MPA measurements,
we were able to generate an estimate
of the relaxation times of tissues under
control and morphogen treatment conditions.
Combined with our AFM and MPA measurements,
these results provide a full characterization of
the bulk material properties of this viscoelastic
supracellular system. These material property
estimates can be efficiently summarized in a
spring-and-dashpot model that captures the
viscoelastic supracellular responses that are
made possible by morphogen activity (Fig. 4G).
Specifically, FGF-exposed tissue has heightened
elasticity as well as higher viscosity (relaxation
time) than control tissue. Conversely,
BMP-exposed tissue has moderate elasticity but
reduced viscosity (relaxation time) compared with
control tissue, which is consistent with our
cracking assay results. Taken together, these
results suggest that BMP-exposed tissues
possess greater fluidity than control tissues,
whereas FGF-exposed tissues possess greater
solidity.
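
The qualitative picture of stiffness and relaxation time can be sketched with the simplest spring-and-dashpot element. The example below assumes a single Maxwell element (a spring in series with a dashpot), for which the stress after a step strain decays exponentially with the relaxation time; the actual diagram used in the paper may differ. All parameter values are dimensionless placeholders chosen only to respect the ordering described in the text (FGF: stiffer with slower relaxation; BMP: moderately stiffer with faster relaxation). They are not measured values.

```python
import math

def maxwell_stress(t, strain, modulus, relaxation_time):
    """Stress at time t after a step strain applied at t = 0 (Maxwell element)."""
    return modulus * strain * math.exp(-t / relaxation_time)

# Hypothetical, dimensionless parameters (NOT measurements from the study).
CONDITIONS = {
    "control": {"modulus": 1.0, "relaxation_time": 1.0},
    "FGF": {"modulus": 2.0, "relaxation_time": 2.0},   # stiffer, relaxes slowly
    "BMP": {"modulus": 1.5, "relaxation_time": 0.5},   # relaxes quickly
}

def fraction_remaining(name, t):
    """Fraction of the initial stress still present at time t."""
    params = CONDITIONS[name]
    initial = maxwell_stress(0.0, 1.0, **params)
    return maxwell_stress(t, 1.0, **params) / initial

for name in CONDITIONS:
    print(name, round(fraction_remaining(name, 1.0), 3))
```

In this toy setting, a condition with a short relaxation time sheds stress quickly (more fluid-like), whereas a long relaxation time holds stress (more solid-like), independent of how stiff the spring is.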


Morphogens tune supracellular activity 
by enabling changes in ECM architecture and 
traction forces 


In addition to changes in viscoelastic proper- 
ties, it is also possible for a supracellular mate- 
rial to modulate the extent of its “activity,” 
which could serve as a key impulse for mor- 
phological transformation. To investigate this, 
we took advantage of our recently developed 
supracellular behavioral assay, which serves as 
a fluidity assay that assesses activity (13). In 
this assay, skin fibroblasts form a supracellular
contractile fluid that resolves into a pattern of
aggregates. In line with our measurements
of stiffness and relaxation time, the addition
of FGF to the behavioral assay interfered with
aggregate formation (Fig. 5A and movie S1),
which suggests a transition to a less active,
solid material. By contrast, treatment with
BMP increased the speed of aggregate
formation as well as the number of final aggregates
(Fig. 5A and movie S1), which prior
theoretical modeling suggested would be the result
of a more actively contractile fluid (13). Taking
these results together with our observations
of in vivo architecture, we hypothesized that
the key functional output of FGF- and
BMP-pathway activity is to create a solid core
surrounded by a contractile fluid margin.

Our multicellular behavioral assay revealed
a difference between BMP-treated and control
cells that indicates enhanced activity mediated
by contractility (Fig. 5A and movie S1). Thus, we
investigated whether BMP serves to increase
contractility by using traction force
microscopy (TFM). Freshly harvested dermal
progenitors were cultured atop a polyacrylamide
hydrogel at a low density such that individual
cells exerted traction in isolation. Embedded
green fluorescent protein (GFP) beads within
the hydrogel enabled the tracking and
measurement of gel displacement. When cultures
were treated with FGF, traction forces remained
similar to those in control cultures. However,
BMP treatment increased traction forces
exerted by dermal cells as compared with control
cells. BMP inhibition with LDN-193189 (LDN)
compound did not have a significant effect on
individual cell contractility (Fig. 5B).

Fig. 5. Morphogens tune ECM architecture and traction force to generate
supracellular contractility. (A) (Left) Bright-field microscopy images from the
ring assay at the 48-hour time point after treatment with BMP, LDN, FGF, or
SU5402 (SU), an FGF inhibitor. (Right) Onset of aggregation and aggregate
number. N = 11 to 28 rings per condition. (B) Average and maximum traction
forces for individual cells plated on polyacrylamide gels [...] condition.
(C) (Left) Bright-field microscopy images of disk contraction (50 hours)
(left) and disk area normalized to control area (right). N = [...]
(D) Immunofluorescent images of nuclei and fibronectin in disks (left) and fibronectin
intensity normalized to DAPI (right). N = 7 to 10 disk regions per condition. (E) (Left)
Immunofluorescent images of dense dermal reconstitutions showing nuclei and
fibronectin (FN) surrounding uncoated agarose beads (asterisk). (Right) Fibronectin
orientation (0° is tangent to bead surface) and the fraction of fibronectin aligned
around a bead. N = 3 reconstitutions per condition (one to three regions each).
a.u., arbitrary units. Scale bars are 500 µm in (A), 2 mm in (C), 50 µm in (D), or
100 µm in (E). *p < 0.05; **p < 0.005; ****p < 0.0001; n.s., not significant.

We next sought to determine how individ- 
ual cell changes in traction that are enabled by 
BMP manifest at the supracellular scale. We 
aimed to generate a context that mimicked the 
planar geometry of the bud margin, where 
BMP is active, rather than the spherical shape 
relevant to the core (Fig. 5C). To this end, we 
cultured cells in freed collagen disks and mea- 
sured disk retraction, comparing areas across 
conditions (Fig. 5C). In line with our hypothesis, 
BMP-treated disks contracted more than control
disks and LDN-treated disks contracted less than 
control disks (Fig. 5C). The fact that LDN treat- 
ment did not have a significant effect on the 
traction forces of individual cells in isolation 
but did affect supracellular disk contraction in- 
dicates the presence of activity-related processes 
that only appear at the supracellular scale. 

Next, we investigated which cellular fea- 
tures are tuned by BMP to generate increases 
in supracellular contractility. Motivated by our 
observation that the BMP-pSMAD active mar- 
gin shows stronger fibronectin staining (Fig. 
1E), we considered that the increase in col- 
lagen disk contraction could be due to a more 
interconnected ECM network (44, 45). In agree- 
ment with our in vivo observations, BMP- 
treated collagen-disk cultures displayed 
more fibronectin deposition than control 
cultures (Fig. 5D). Conversely, when disks 
were treated with LDN, fibronectin deposition 
was nearly absent (Fig. 5D). These results 
support a role for fibronectin downstream 
of BMP in enhancing collective contraction in 
the fluid-like margin. 

With an understanding of the two distinct 
tissue material properties generated by FGF 
and BMP morphogens, we explored the possi- 
bility that the emerging solid core and con- 
tractile fluid margin may mutually enhance 
each other’s material properties. Cells in the 
BMP-active margin displayed increased elon- 
gation and alignment (Fig. 1), but we did not 
observe similar levels of alignment or elonga- 
tion in dense dermal reconstitutions treated 
with BMP (Fig. 5E and fig. S6, A and B). Noting 
that, in vivo, the margin cells align around a 
more-solid core, we hypothesized that BMP- 
enabled alignment may only occur in the pres- 
ence of a neighboring solid tissue. To test this 
hypothesis, we incorporated agarose beads into 
dense dermal cultures to mimic the round 
solid core. Indeed, BMP-treated cells displayed 
more alignment around the bead than control 
cells (Fig. 5E and fig. S6A). Thus, the material 
properties of the FGF-active core provide a 
solid substrate for the surrounding contract- 
ile fluid layer of the margin to orient around, 
and the core serves as a force guide that 
orients margin contractility. When considering 
a mutually conditioning interaction, we reasoned 
that the oriented contraction of the margin 
likely compresses the core to further solidify it. 
Indeed, cytoskeletal elements, including vimen- 
tin, are up-regulated in compressed fibroblasts 
to protect the nucleus and induce a form of 
strain stiffening (43). Thus, each phase may be 
essential in amplifying the properties of the 
other, further separating the phase domains. 


Interactions between phases generate 
stresses that break morphological symmetry 


Next, we investigated the morphological con- 
sequences of these interacting tissue phases. 
We hypothesized that a fluid margin contracting
around a solid core could generate forces
in the dermis sufficient to propel the follicle 
out of the plane of the skin. Such a mechanism 
is distinct from those based on cell migration 
by means of chemotaxis or differential growth 
through proliferation, which focus on individ- 
ual cell behavior rather than emergent supra- 
cellular properties. In line with our hypothesis, 
we saw an effect on budding when tuning 
stiffness and contractility. Specifically, using 
a skin explant culture in which the process of
budding occurs ex vivo, we stiffened skin ex- 
plants through the addition of genipin, a cross- 
linking agent, which decreased or blocked 
budding when low or high concentrations were 
added, respectively (Fig. 6A) (46). Conversely, 
when contractility is decreased in explants, 
which would predominantly affect the
contractile margin, budding is inhibited (12).

Budding requires transformation of the over- 
lying epidermis. We thus investigated the role 
of the epidermis in this process. To do so, we 
excised tissue at successive follicle stages and 
performed live-tissue dissections during which 
we removed the epidermis. Immediately after 
removal, we fixed the tissue and then charac- 
terized the resulting architecture in tissue sec- 
tions. At the condensation stages, when the 
two tissue phases had yet to form, the dermal 
tissue remained in plane. However, after es- 
tablishing the margin and core domains, the 
follicle dermis protruded upon epidermal re- 
moval (Fig. 6B). These results suggest that in- 
teractions between the core and margin tissue 
phases generate residual stress that has the 
mechanical potential to propel the bud out of 
the plane. Furthermore, these results show that 
the physical presence of the epidermis initially 
resists this action. 

How, then, does the core-margin complex 
act to propel the bud given the presence of 
the overlying epidermis? Guided by our obser- 
vations that the intensity of laminin in the 
basement membrane decreased as the par- 
ticulate staining in the subjacent dermis in- 
creased (as shown in Fig. 1A), we investigated 
whether digestion of the basement membrane 
is a critical step that permits the forces in the
dermis to generate a bud. In line with this 
model, our single-cell mRNA sequencing results 
revealed a number of matrix metalloproteinase 
(MMP) genes expressed specifically in the core 
of budding follicles (as shown in Fig. 2F). To 
test the idea that an MMP-based digestion of 
the basement membrane is essential for multi- 
phase sculpting of the tissue, we inhibited MMP 
activity in explant culture. Indeed, the base- 
ment membrane in the MMP inhibitor condi- 
tion remained intact and continued to buckle 
rather than flatten (Fig. 6C). Furthermore, when 
we inhibited MMP activity, the bud failed to 
protrude (Fig. 6C). These results demonstrate 
that in addition to the forces generated in the 
dermis through the interactions of the core and
margin domains, the weakening of the basement 
membrane is a critical event that is required 
for the follicle to protrude through the plane 
of the skin. 

With an understanding of the properties of 
the components and their putative interac- 
tions, we set out to build a quantitative phys- 
ical model that could predict the organ-level 
consequences of the interacting domains in 
the skin to recapitulate this process. We chose 
a phase-field formalism because it allows for 
explicit representation of multiple material 
phases and easily interpretable governing equa- 
tions. Furthermore, it is well suited for incor- 
porating the distinct active properties of the 
material phases. Limiting the model to two 
dimensions, we describe this process as a free- 
boundary problem, in which different phases 
can deform and exert forces on one another
but have a minimal tendency to mix with
one another (Fig. 6D and movie S2). The model
allowed for local manipulation of viscoelasticity, 
geometry, and activity of each material phase 
as well as global manipulation across all phases 
(see the supplementary text). Notably, the model 
is constructed so that all of the bulk material 
properties of each phase can be estimated or 
directly measured from our experiments. 
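
To make the idea of material phases with little tendency to mix more concrete, the sketch below evaluates a double-well bulk free-energy density, a standard ingredient of textbook phase-field models. It is a generic illustration under assumed notation (an order parameter `phi` that is 0 in one phase and 1 in the other, and a barrier height `w`); it is not the set of governing equations used in this study, which are described in the supplementary text.

```python
# Generic textbook double-well potential, NOT the authors' model.
# phi = 0 and phi = 1 stand for the two pure phases (hypothetical labels);
# intermediate values of phi represent a mixed state.

def double_well(phi: float, w: float = 1.0) -> float:
    """Bulk free-energy density f(phi) = w * phi^2 * (1 - phi)^2."""
    return w * phi**2 * (1.0 - phi) ** 2

def double_well_derivative(phi: float, w: float = 1.0) -> float:
    """df/dphi = 2 w phi (1 - phi) (1 - 2 phi); zero at phi = 0, 1/2, and 1."""
    return 2.0 * w * phi * (1.0 - phi) * (1.0 - 2.0 * phi)

for phi in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"phi={phi:.2f}  f={double_well(phi):.4f}  df/dphi={double_well_derivative(phi):+.4f}")
```

The pure phases sit at the two energy minima, whereas any mixed state costs energy, which is one simple way a model can encode phases that deform and push on each other while remaining largely unmixed.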

First, we validated that the physical model 
could recapitulate the morphodynamics of wild- 
type follicle budding based on experimentally 
derived parameters for phase geometry, rela- 
tive values of interfacial energies, and bulk 
mechanical properties (Fig. 6D). To test the 
generalizability of the model, we performed 
in silico perturbations that mimic the exper- 
imental manipulations described above. Indeed, 
our model predicts that increasing the stiff- 
ness and decreasing the contractility of the 
dermis prevents budding in a dosage-dependent 
manner (Fig. 6E, fig. S7A, and movies S3 and 
S4). Furthermore, the removal of the epidermal
layer in the model predicts the dermal
protrusion observed in our epidermal removal 
experiments (Fig. 6F, fig. S7B, and movie S6). 
Next, we tested whether the model recapit- 
ulates the finding that basement membrane 
digestion is needed for budding. To do so, we 
excluded weakening the epidermis from the 
model, and, indeed, the model predicted that 
the skin would fail to bud without this event 
(Fig. 6G, fig. S7A, and movie S5). 

[Fig. 6 caption, continued] N = 5 follicles per condition. (D) (Left) Phase-field model simulations showing
the budding process; the contractile active fluid margin is shown in dark blue,
and the viscoelastic solid core is shown in orange (top). Anisotropic contractility
[...] treated according to the viscoelastic model described in the supplementary text.

Our model also provided an opportunity to
test how modulating multiphase tissue prop-
erties could account for variance in follicle
budding across the body. To test our theory
in cases other than the back skin bud, we took
advantage of the varying bud geometries ob-
served across the body. When comparing the
geometry of follicles in the skin of the back
to that of the follicles in the skin of the head,
we found that the profile of the head bud is
more extended and prominent (Fig. 6, H and I,
fig. S7C, and movie S7). In considering what
might cause this shape in our physical model, 
we found that a critical parameter is the geom- 
etry of the domains, with a deeper margin 
leading to an in silico prediction that resem- 
bles the head bud profile. Indeed, we observed 
that the pSMAD region encircles a larger core 
domain (Fig. 6I and fig. S7C). Further, the
geometry of the more protrusive bud and the 
increased follicle spacing in the head are con- 
sistent with what has been observed pre- 
viously in cases of increased contractility (72). 
Inputting the domain geometry together with 
increased contractility leads to a budding pro- 
file that mimics the head-follicle bud shape 
(Fig. 6, H and I, and fig. S7C). Interestingly, in 
the absence of force guidance provided by the 
solid core domain, it becomes possible for our 
model to achieve a concave rather than con- 
vex outcome (Fig. 6J, fig. S7D, and movie S8), 
which mimics invagination after aggregation 
seen in structures like hair follicles (47). This 
suggests that these multiphase interactions 
may be generalized beyond budding to break 
the tissue plane, either inward or outward. 


Discussion 


Given that tissue symmetry breaking is a multi- 
scale, self-organizing process, it is a challenge 
to identify the functionally salient mechan- 
ical events that lead to increases in morpho- 
logical complexity. Rather than focus on the 
behavior of individual cells, our experimental 
identification of emergent supracellular mate- 
rial properties led us to focus on the mechanics 
that arise between two distinct supracellular 
material phases. Our study shows that phase 
differences in tissues can arise not only in time 
as a tissue matures (31, 48, 49) but also concur- 
rently in space to create a tissue-scale mechan- 
ical instability. 

We show that morphogens are responsible 
for enabling these separated phases. By link- 
ing phase concepts to supracellular mechanics, 
we reveal a role for morphogens in creating a 
morphological order that is distinct from ca- 
nonically proposed roles. Canonical accounts 
of the role of morphogens focus on effects— 
including morphogen concentration sens- 
ing, proliferation, and directed migration or 
chemotaxis—that can be characterized at the 
scale of a single cell (7, 18, 28, 29). By contrast, 
our results support a model in which the effect 
of morphogens on emergent material proper- 
ties at the supracellular scale must be consi- 
dered. Through such consideration, we find
that morphogens enable the creation of base-
ment membrane-less tissue compartments
within a mesenchymal tissue. 

The creation of distinct phases at the sub- 
cellular scale to serve as membraneless com- 
partments has been broadly noted (35) but 
also questioned with regard to functional rel- 
evance (50). In this work, we show that a clear 
functional role for compartments at the 
supracellular scale is to generate polyphasic 
complexes that, through their geometric ar- 
rangement, are mechanically primed to in- 
crease ordered morphological complexity. 

Our findings indicate that it is necessary to 
distinguish between the proximal effects of 
morphogens, which include modulating gene 
expression within individual cells, and their 
ultimate functional effects, which enable the 
formation of distinct supracellular phases. Our 
study supports a model in which the subtle 
tuning of hundreds of genes at the individual-cell 
scale can coalesce into emergent and discrete 
material and mechanical properties that con- 
tribute to the creation of organ morphologies. 

We hypothesize that morphogen-enabled 
polyphasic supracellular juxtaposition may gen- 
erate tissue architecture in other organs. In 
addition, such processes may be present in 
pathological contexts such as tumors, in which 
cancer-associated fibroblasts may tune their
material properties and subsequent supracel- 
lular mechanical behavior in response to chem- 
ical signals in order to potentiate aberrant 
morphogenesis. 


Materials and methods 
Embryos and dissections 


Embryonic back skins, consisting of one to five 
rows of follicles, were dissected, and the epi- 
dermis and dermis were separated by peeling 
apart the two layers after 15 min in cold 
calcium- and magnesium-free Hanks’ balanced 
salt solution (HBSS). Extracted dermises from 
several embryos were combined and disso- 
ciated in a mixture of trypsin and collagenase 
at 37°C for 10 min. Acellular tissue compo-
nents were removed using a 40-µm filter.

Fertilized chicken eggs (white leghorn) were 
obtained from commercial sources, incubated 
at 37.8°C, and staged according to Hamburger
and Hamilton (51).


[Fig. 6 caption, continued] [...] the MMP inhibitor experiments (G). (H) Phase-field simulations modeling domain
geometry in the head. (I) Profiles of the core-margin boundary (prebud)
and bud shape (postbud) for the back (cyan) and head (purple) from both experiment
and simulation. N = 3 follicles per condition. (J) Phase-field simulations for tissue
that lacks the core domain at the prebud- and postbud-equivalent stages. For details on
simulation perturbations, see the supplementary text. Scale bars are 100 µm.


Cell and tissue culture assays
Collagen disk assays


Dissociated dermal cells from E7.5 back skins
consisting of one to three rows of emerging
follicles were diluted in full media [10% fetal
bovine serum (FBS), 2% chick serum (CS)] to
5000 cells/µl. Collagen-I solution at 3 mg/ml
was prepared from a 5-mg/ml collagen-I stock
(ibidi, no. 50201) by mixing collagen at 0.6×
total volume, NaOH at 0.0201× total volume
for neutralization (for a pH of roughly 7.35),
and full media. Cells and collagen-I solution
were then mixed 1:1 to give a final cell
concentration of 2500 cells/µl in 1.5-mg/ml
collagen solution. Disks were prepared by add-
ing 200 µl of the resulting collagen-cell mix-
ture to a non-tissue culture-treated 48-well
plastic-bottomed plate and polymerizing for
1 hour at 37°C. Disks were released from the
bottom of the plate by moving a small pi-
pette tip around the polymerized disk. Full
media, FGF (R&D, 273-F9, 400 ng/ml), BMP
(R&D, 5020-BP, 100 ng/ml), or LDN (TOCRIS,
no. 6053, 100 nM) was added to the disks.
Disks were cultured at 37°C with 5% CO2 in a
humid environment and subsequently imaged
manually.
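
The dilution arithmetic in this protocol can be checked with a short calculation. The sketch below is illustrative only: the function name `collagen_working_solution` and the 1000-µl batch size are hypothetical, and it assumes that full media makes up whatever volume remains after collagen and NaOH are added.

```python
def collagen_working_solution(total_ul, stock_mg_ml=5.0,
                              collagen_frac=0.6, naoh_frac=0.0201):
    """Component volumes (µl) and concentration of the collagen-I working solution."""
    collagen_ul = collagen_frac * total_ul
    naoh_ul = naoh_frac * total_ul
    # Assumption: full media fills the remaining volume.
    media_ul = total_ul - collagen_ul - naoh_ul
    return {
        "collagen_ul": collagen_ul,
        "naoh_ul": naoh_ul,
        "media_ul": media_ul,
        "collagen_mg_ml": stock_mg_ml * collagen_frac,
    }


batch = collagen_working_solution(1000.0)  # hypothetical 1000-µl batch

# Mixing cell suspension and collagen solution 1:1 halves both concentrations.
final_collagen_mg_ml = batch["collagen_mg_ml"] / 2
final_cells_per_ul = 5000 / 2
cells_per_disk = final_cells_per_ul * 200  # 200 µl of mixture per disk
```

Under these assumptions the calculation reproduces the 3-mg/ml working solution, the 1.5-mg/ml final collagen concentration, and the 2500 cells/µl stated above, which corresponds to 500,000 cells per 200-µl disk.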


High-cell concentration suspensions for 
spheroid formation and merging 


Dissociated dermal cells from E8 back skins 
consisting of three to five rows of emerging fol-
licles were diluted in full media to 3000 cells/µl.
Spheroids were generated by culturing cells in
5-µl drops upside-down in a 100-mm-by-20-mm
tissue culture plate at 37°C with 5% CO2 in a
humid environment for 48 hours. Full media, FGF
(400 ng/ml), or BMP (100 ng/ml) was added
before plating the 5-µl drops for culturing of
treated spheroids. Resulting spheroids were
transferred with a pipette into 200 µl of full
media in a 96-well Corning spheroid microplate
(CLS4515). Spheroid merging was visualized in
a BioTek BioSpa 8 at 37°C with 5% CO2 in a
humid environment (~93% humidity); plates were
automatically transferred for imaging every hour.


Low cell concentration suspensions for 
spheroid formation 


Dissociated dermal cells from E8 back skins 
consisting of three to five rows of emerging 
follicles were diluted in full media to 1300 cells/µl.
Cells were cultured in 17-µl drops upside-down
in a 100-mm-by-20-mm tissue culture-treated
plate for 48 hours. Full media, FGF (400 ng/ml),
or BMP (100 ng/ml) was added before plating
the 17-µl drops for culturing of treated hanging
drop cultures. Spheroids were cultured at 37°C
with 5% CO2 in a humid environment and sub-
sequently imaged manually.


Ex vivo ring assay


Dissociated dermal cells from E8 back skins 
consisting of three to five rows of emerging 
follicles were diluted to 2100 cells/µl in full media
and plated on collagen-I that had been stored
overnight at 4°C and incubated for 30 min
at 37°C in a drying incubator before plating, as
previously reported (73). Cells were incubated at 37°C for
1 hour, and then full media, FGF (200 ng/ml),
SU5402 (TOCRIS, no. 3300, 10 µM), BMP (100 ng/ml),
or LDN (1 µM) was added to the well. For
all long-term imaging experiments, cells plated
on collagen-I were incubated in a BioTek BioSpa
8 at 37°C with 5% CO2 in a humid environment
(~93% humidity) and were automatically trans-
ferred for imaging every hour.


Explant assay 


Dissected back skins from E6.5 were placed 
dermal-side down on transparent PET mem-
brane cell culture inserts with a 0.4-µm pore
size (Corning, no. 353090) in six-well plates.
Inserts with skins were cultured at 37°C with
5% CO2 in a humid environment in six-well
plates with full media, MMP inhibitor (Sigma,
Marimastat, no. 444289, 100 µM), genipin
(Sigma, G4796, 50 µM or 250 µM), or SU5402
(10 µM). Genipin is a natural plant extract
that has been shown to cross-link extracellular 
matrix proteins such as collagen and gelatin 
(46), thereby increasing the stiffness of the 
tissue. 


Dense dermal reconstitution assay 


Dissociated dermal cells from E8 back skins 
consisting of three to five rows of emerging 
follicles were resuspended in full media. Eighteen 
microliters of full media were added per dis- 
sociated skin, and the cells were reconstituted 
to make a high-density cell solution. Cells were 
plated in 10-µl drops on cell culture inserts.
Commercial agarose beads with 75- to 150-µm
diameters were purchased from Bio-Rad (no.
1537302). Homemade agarose beads were made
by pipetting 0.6 µl of 0.5% agarose solution
(made from agarose powder with low gelling
temperature purchased from Sigma, A9045)
onto sterile parafilm and letting it polymer-
ize for 5 min. The homemade agarose beads
were mixed into the 10-µl drops when indi-
cated. Inserts with cells were cultured at 37°C
with 5% CO2 in a humid environment in six-
well plates with full media, FGF (400 ng/ml),
or BMP (200 ng/ml). 

For all experiments, data include a combi- 
nation of technical replicates that were per- 
formed within a single trial and biological 
replicates performed across multiple trials. 
Data from biological replicates and techni- 
cal replicates were combined if data were con- 
sistent. Individual samples from experimental 
assays were omitted only if initial conditions 
(e.g., drop geometry, ring geometry) were in- 
correct owing to technical error.


AFM


Whole embryos from E8 were dissected, rinsed, 
and immediately transferred to optimal cutting
temperature compound (OCT) on ice. After 
two 30-min washes in OCT, they were trans- 
ferred to embedding molds and snap frozen in 
ethanol that was precooled with dry ice. 
Fourteen-micrometer-thick longitudinal frozen 
sections were obtained by cryosection, trans- 
ferred to glass-bottom petri dishes (FluoroDish, 
FD35), and briefly fixed for 5 min in 4% para- 
formaldehyde for the AFM acquisition (24). 
Acquisition was performed at room tempera-
ture using a Nanowizard V (JPK-Bruker) mi-
croscope in QI™ Advanced mode (stiffness
mapping) or Contact Mode Force Spectros- 
copy mode (single force curve). 

Dense dermal reconstitution cultures treated 
with full media, FGF (400 ng/ml), or BMP 
(200 ng/ml) were seeded on 2-mg/ml collagen 
gel on glass-bottom petri dishes (FluoroDish, 
FD5040) coated with polylysine (Sigma, P8920) 
and cultured for 48 hours for the AFM ac- 
quisition. Acquisition was performed at 37°C 
using an MFP-3D-BIO AFM microscope (Oxford 
Instruments). 

Before each experiment, the exact spring 
constant of the cantilever was determined using 
the thermal noise method, and its optical sen- 
sitivity was determined using a phosphate- 
buffered saline (PBS)-filled glass-bottom Petri 
dish as an infinitely stiff surface. The following 
settings were used: tip Poisson's ratio ν_tip = 0.25,
tip Young’s modulus E_tip = 290 GPa, and sample
Poisson's ratio ν_sample = 0.45.

Silicon nitride cantilevers with 5-µm-diameter
spherical tips (nominal spring constant k = 0.2 N/m,
Bruker) were used for high-resolution stiffness
maps. For embryo sections, stiffness maps of
20 µm by 20 µm, with 16-by-16 grid points and a
trigger point of 1 nN, were collected at 1.5 Hz for
a single approach-withdraw cycle. One to five
measurements were taken per region, per fol-
licle with the following totals: 16 for the core,
19 for the margin, and 25 for the interbud;
median values were then obtained. For dense
dermal reconstitution, stiffness maps of 90 µm
by 90 µm, with 10-by-10 grid points and a trigger
point of 2 to 3 nN, were collected at 1.5 Hz for a
single approach-withdraw cycle. Four biological 
replicates per condition were measured. For 
each biological replicate, five measurements 
were taken (i.e., 20 data points per condition), 
and average values were obtained. 

A polystyrene particle (45-µm diameter) on
a silicon nitride cantilever was used (nominal 
spring constant k = 0.35 N/m, Novascan) for 
lower-resolution stiffness curves. For embryo 
sections, a trigger point of 1 nN was used. Seven 
follicles were measured. One to six measure- 
ments were taken per region per follicle with 
the following totals: 22 for the core, 21 for the 
margin, and 32 for the interbud; average values 
were then obtained. For dense dermal recon-
stitution, a trigger point of 30 to 50 nN was
used to ensure sample penetration of 2 to 3 µm.
Four biological replicates per condition were
measured. For each biological replicate, 6 to
10 positions were measured. Each position was
measured three times, and average values
were obtained.

For high-resolution stiffness maps, the force 
curve in each grid point was fitted according 
to the Hertz model (Igor Pro, Wavemetrics) to 
calculate the Young’s modulus. For single force 
curve measurements, the curve was fitted using 
the same model. Force curves that did not meet 
predetermined standards were removed. For 
embryo sections, the Young’s modulus of each 
region was normalized to the average Young’s 
modulus of the interfollicle region within the 
same experiment. For the dense dermal recon- 
stitution culture, the Young’s modulus for 
each condition was normalized to the average
Young’s modulus of the control condition 
within the same experiment. 
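
The fitting and normalization steps above can be illustrated with a minimal sketch. It is not the authors' Igor Pro procedure: it assumes the spherical-indenter form of the Hertz model, F = (4/3) E_eff sqrt(R) δ^(3/2) with 1/E_eff = (1 − ν_sample²)/E_sample + (1 − ν_tip²)/E_tip, a known contact point, and a least-squares fit through the origin. The function names and the synthetic 1-kPa test curve are hypothetical.

```python
import numpy as np

# Settings quoted in the text.
NU_TIP, E_TIP_PA, NU_SAMPLE = 0.25, 290e9, 0.45


def hertz_young_modulus(indentation_m, force_n, tip_radius_m):
    """Fit F = (4/3) * E_eff * sqrt(R) * delta**1.5 through the origin; return E_sample (Pa)."""
    x = np.asarray(indentation_m, dtype=float) ** 1.5
    f = np.asarray(force_n, dtype=float)
    slope = np.dot(x, f) / np.dot(x, x)  # least-squares slope of F against delta**1.5
    e_eff = 3.0 * slope / (4.0 * np.sqrt(tip_radius_m))
    # 1/E_eff = (1 - nu_s^2)/E_s + (1 - nu_tip^2)/E_tip, solved for E_s.
    return (1.0 - NU_SAMPLE**2) / (1.0 / e_eff - (1.0 - NU_TIP**2) / E_TIP_PA)


def normalize_to_reference(moduli, reference_moduli):
    """Divide each modulus by the mean modulus of the reference region or condition."""
    return np.asarray(moduli, dtype=float) / np.mean(reference_moduli)


# Synthetic, noise-free curve for a hypothetical 1-kPa sample and a 2.5-µm tip radius.
radius = 2.5e-6
delta = np.linspace(0.0, 1e-6, 50)
e_true = 1000.0
e_eff_true = 1.0 / ((1 - NU_SAMPLE**2) / e_true + (1 - NU_TIP**2) / E_TIP_PA)
force = (4.0 / 3.0) * e_eff_true * np.sqrt(radius) * delta**1.5
e_fit = hertz_young_modulus(delta, force, radius)
relative = normalize_to_reference([e_fit, 2 * e_fit], [e_fit])
```

Because the tip is many orders of magnitude stiffer than the tissue, the tip term contributes almost nothing to E_eff; it is kept here only so that the quoted settings appear explicitly.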


MPA 


Micropipettes were manufactured as previous- 
ly described (73), except that the pipettes of 
smallest diameter (<25 µm) were cut using a
World Precision Instruments DMF1000 micro-
forge, whereas the larger pipettes (>25 µm)
were cut using a Sutter Instrument ceramic tile 
for scoring glass (NC9569052, Fisher Scientific). 

The aspiration apparatus and imaging setup 
were also previously described (73). Dense der- 
mal reconstitution cultures treated with full 
media, FGF (400 ng/ml), or BMP (200 ng/ml) 
were each aspirated at various sites using 5- to 
7-µm and 50-µm diameter pipettes. For the 5-
to 7-µm pipettes, 8 kPa of pressure was applied
rapidly at t = 0 s, held constant, and released
rapidly at t = 15 s; images were acquired every
1 s from t = 0 s to t = 30 s. In one trial, pressure
was held (and released) for 30 rather than 15 s.
For the 50-µm pipettes, 4 kPa of pressure was
applied at t = 0 s, held until t = 90 s, and then
released. Images were acquired every 950 ms 
for the first 15 s and then every 5 s for the next 
75 s, after pressure application and release. 
Three independent trials (biological replicates) 
with 6 to 10 technical replicates per trial were 
measured for the 5- to 7-µm pipette, and three
independent trials with five to seven technical
replicates per trial were measured for the
50-µm pipette. Homemade agarose beads were
aspirated at various sites into a 50-µm diameter
pipette under 8 kPa of pressure, applied for 
30 s, and then released for 30 s, with images 
taken every 1 s. Four beads were measured. 
In every case, the instantaneous strain, creep 
strain, elastic recovery, and anelastic recov- 
ery were captured. 

Image analysis was performed as previously 
described (13). The instantaneous strain (i.e., 
extension of the sample into the pipettes at t ~ 
15 s) was used to calculate Young’s modulus
for each condition and pipette size. The whole
strain curve over time was used to estimate 
tissue viscoelastic properties, as explained in 
greater detail in the supplementary text. 


TFM 


Polyacrylamide (pAA) gels were prepared fol- 
lowing previously described methods (52). Gels 
were coated with 40-µg/ml biotin-labeled fib-
ronectin for 2 hours at room temperature
before cell seeding. Cells were seeded on gels
at a concentration of 670,000 cells/ml in full
media, Y27632 (TOCRIS, no. 1254, 10 µM), FGF
(400 ng/ml), BMP (100 ng/ml), or LDN (100 nM).
The cells were cultured for 18 hours at 37°C with
5% CO2 in a humid environment before imaging.

To estimate the displacement of the beads 
caused by cells exerting force on the pAA gels, 
images of the same region were acquired at 
two time points: while the cells were seeded 
on the gels and after adding trypsin to the cul- 
ture media to detach the cells. Images were 
collected using an Olympus IX-71 fluorescence 
microscope with an Andor iXon3 EM-CCD and 
equipped with a 100x/1.45 NA Plan Apochro- 
mat objective (Olympus). Illumination channel 
488 nm was used for visualization of the bead 
layer, whereas cells were visualized using bright- 
field illumination. MetaMorph for Olympus 
was used to collect images. For each condi- 
tion, four to nine independent cultures (bio- 
logical replicates) with two to six cells per 
culture were imaged. 

The image of beads, while the cells were 
present, represents the deformed state of the 
gels, which is compared with its state after 
the cells have been removed and the gel could 
relax (zero displacement reference). In FIJI, 
the images taken at the two time points were 
aligned using the plug-in “Registration of multi- 
channel timelapse with linear stack alignment 
with SIFT” to correct for relative translational 
xy-shift before analyzing the bead displace- 
ment with particle image velocimetry (PIV). 
PIV enables one to interpolate the particle dis- 
placements into a regularized grid correspond- 
ing to the approximate substrate displacement 
without the need to track individual bead 
movements. PIV analysis was conducted using 
the MATLAB toolbox PIVlab (53). The displace-
ment vector map generated by PIVlab was
used to calculate the traction forces in MATLAB.
The custom MATLAB script implements the un-
constrained Fourier transform traction cytom-
etry algorithm described in (54). After the
stress fields were calculated, the cell boundaries 
of an individual cell were outlined manually. 
Stress vectors inside of the cell boundaries that 
surpassed the background threshold (defined 
as the average stress magnitude of vectors in 
an area without cells) were used to calculate 
the average and maximum stress magnitude 
inside of the cell boundaries. Brown-Forsythe
and Welch one-way analysis of variance (ANOVA)
tests were performed to compare the
mean of each condition with that of the con-
trol condition.
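
The thresholding step described above can be sketched as follows. This is a Python illustration, not the authors' MATLAB script: the function name `cell_stress_summary`, the boolean-mask representation of the manually outlined cell and of the cell-free background area, and the toy stress values are all assumptions.

```python
import numpy as np


def cell_stress_summary(stress_x, stress_y, cell_mask, background_mask):
    """Mean and maximum stress magnitude inside a cell, above a background threshold.

    The threshold is the average stress magnitude in an area without cells.
    """
    magnitude = np.hypot(stress_x, stress_y)
    threshold = magnitude[background_mask].mean()
    inside = magnitude[cell_mask]
    kept = inside[inside > threshold]  # keep only vectors that surpass the background
    if kept.size == 0:
        return threshold, float("nan"), float("nan")
    return threshold, kept.mean(), kept.max()


# Toy 2-by-3 stress field (arbitrary units): column 0 is background, columns 1-2 are the cell.
sx = np.array([[1.0, 5.0, 0.5],
               [1.0, 10.0, 3.0]])
sy = np.zeros_like(sx)
background = np.zeros(sx.shape, dtype=bool)
background[:, 0] = True
cell = ~background
threshold, mean_stress, max_stress = cell_stress_summary(sx, sy, cell, background)
```

In this toy field the background threshold is 1.0, so the 0.5 vector inside the cell is discarded before the mean and maximum are taken.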


SEM 


Cell-culture insert membranes from dense der- 
mal reconstitution assays treated with full me- 
dia, FGF (400 ng/ml), or BMP (200 ng/ml) 
were cut in the shape of round disks of 10-mm 
diameter. Each disk was placed in an individual
well of a cell-culture plate, and the cells were
fixed in 2% glutaraldehyde, 4% paraformaldehyde
in 0.1-M sodium cacodylate buffer, pH 7.2 for
2 hours at room temperature followed by over- 
night fixation at 4°C. The cells were then washed 
three times with 0.1-M sodium cacodylate, 
pH 7.2 for 5 min each. They were then treated 
with 1% osmium tetroxide in 0.1-M sodium 
cacodylate buffer, pH 7.2 for 45 min at 4°C. 
After washing with Milli-Q water, cells were 
dehydrated in a graded ethanol series. Dehy-
dration of cells with 30, 50, and 70% ethanol
was performed at 4°C, whereas 90 and 100%
ethanol treatments were performed at room 
temperature. Cells were dried using an Auto- 
samdri-931 (Tousimis, USA) critical point dryer 
(CPD). The detailed CPD protocol was as fol- 
lows: slow fill 3 min, fill time 6 min, purge time 
10 min, postpurge 4 min, critical point 10 min, 
and bleed/vent 300 psi. After CPD, the cells 
were mounted on carbon tape adhered to a 
flat stub. The cells were coated with a 9-nm- 
thick iridium layer using a Leica EM AC600 
sputter coater (Leica, USA). The coated sam- 
ples were imaged with a JEOL JSM-IT500HR 
scanning electron microscope (Jeol, Inc., USA). 
Six independent cultures per condition were 
imaged. 


Immunohistochemistry and imaging 


Skin samples for whole-mount imaging were 
collected at the appropriate embryonic day and 
fixed in 4% paraformaldehyde in PBS at room 
temperature for 1 hour. After fixation, samples 
were rinsed in PBS and PBS-Triton (PBSTrHI, 
1.0% Triton X-100) and then blocked immedi- 
ately for 1 hour in 100% CAS Block (Invitrogen). 

Skin samples shown in cross section were 
from whole embryos at the appropriate em- 
bryonic day fixed in 4% paraformaldehyde in 
PBS overnight at 4°C. Wild-type skins and skins 
with the epidermis removed (shown in Fig. 6) 
were dissected from embryos at the appropri- 
ate embryonic day and immediately fixed in 
4% paraformaldehyde in PBS at room tem- 
perature for 1 hour. They were then embedded 
in OCT. Fourteen-micrometer-thick longitudi- 
nal or transverse frozen sections were obtained 
for immunofluorescence, rinsed in PBS and 
PBS-Triton (PBSTrLO, 0.1% Triton X-100), and 
then blocked immediately for 30 min in 100% 
CAS Block. 

Collagen-cell disks were fixed in 4% para- 
formaldehyde in PBS for 30 min at room tem-
perature. Explant cultures were fixed in 4%
paraformaldehyde in PBS for 1 hour at room 
temperature. Dense dermal reconstitution cul- 
tures were fixed in 4% paraformaldehyde in 
PBS for 30 min at room temperature. After 
fixation, samples were rinsed in PBS and PBS- 
Triton (PBSTrHI, 1.0% Triton X-100) and then 
blocked immediately for 1 hour in 100% CAS 
Block. 

After blocking, all samples were stained in 
10% CAS in PBSTrLO overnight at 4°C. The 
following primary antibodies were used: laminin 
(1:100 sections; ab11575 Abcam), E-cadherin 
(1:100 sections; ab76055 Abcam), fibronectin 
(1:100 whole-mount; ab6328 Abcam), pSMAD1/
5/9 (1:300 sections; 1:100 whole-mount; 13820
Cell Signaling Technologies), and pERK (1:100 
sections; sc-7383 Santa Cruz Biotechnology). 
Samples were subsequently rinsed in PBSTrLO
and then incubated for 2 hours with the
following secondary antibodies: Alexa Fluor 
488 (1:300, Invitrogen), Alexa Fluor 555 (1:300, 
Invitrogen), and Alexa Fluor 647 (1:300, In- 
vitrogen). 4’,6-Diamidino-2-phenylindole (DAPI) 
(1:1000, Invitrogen) was used to stain nuclei. 
Alexa Fluor 488 and 647 Phalloidin (Fisher 
Scientific, A12379, A22287) were used to stain 
actin. Antigen retrieval was used for pSMAD 
and pERK staining. 

For FISH, skin sections were postfixed in 4% 
paraformaldehyde in PBS, permeabilized through 
an ethanol series (50, 70, and 100%, 5 min 
each) at -20°C, and rinsed in PBS with Tween 
(PBST) (0.1% Tween-20). Probe hybridization 
and amplification were performed using a pro- 
tocol from Molecular Instruments (55). The 
probes for postn, jam3, mmp27, fat4, zfhx3, 
ppp1r14b, and fst were designed and produced
by Molecular Instruments. 

Longitudinal sections shown in Fig. 1, whole- 
mount images of fixed skin explants shown in 
Figs. 1 and 2, and immunofluorescent images 
of collagen-cell disks shown in Fig. 5 were imaged 
using a Zeiss LSM880 and are max-intensity 
projections of two to three z-stacks. All other 
imaging was performed using a Zeiss AxioImager.


Analysis and quantification 


Nuclear aspect ratio and density were quanti- 
fied in ImageJ (56). The number of nuclei in 
each respective region was counted and nor-
malized per 500-µm² area.

Intensity values for the core and margin in 
Fig. 2 were sampled from three regions per 
domain per embryo, and all values are plotted. 
The intensity values in Fig. 2 were normalized 
(background subtraction) to a region outside 
the core and margin domains. To account for 
variation in fibronectin intensity between sam- 
ples in Fig. 5, we first normalized both raw 
fibronectin and DAPI immunofluorescent sig-
nals for the entire region of interest (ROI) to
the fluorescence intensity in a region within the
disk with no cells or matrix. We then divided the
normalized fibronectin value by the normalized
DAPI value. Together, these steps reduced variability
that was caused by immunofluorescence as well
as by changes in cell density.
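
The two-step normalization for Fig. 5 can be written compactly. The sketch below is an illustration under a stated assumption: "normalized to" is interpreted here as division by the mean intensity of the cell-free, matrix-free region, which the text does not specify. The function name and the intensity values are hypothetical.

```python
import numpy as np


def normalized_fibronectin(fn_roi, dapi_roi, fn_blank, dapi_blank):
    """Fibronectin signal per unit DAPI signal, each first scaled to a blank region."""
    # Assumption: normalization is a ratio to the mean intensity of the blank region.
    fn_norm = np.mean(fn_roi) / np.mean(fn_blank)
    dapi_norm = np.mean(dapi_roi) / np.mean(dapi_blank)
    return fn_norm / dapi_norm


# Hypothetical pixel intensities for one disk.
value = normalized_fibronectin(
    fn_roi=[180.0, 220.0],    # mean 200
    dapi_roi=[90.0, 110.0],   # mean 100
    fn_blank=[50.0, 50.0],
    dapi_blank=[50.0, 50.0],
)
```

Dividing by the DAPI term is what compensates for differences in cell density between disks; the blank-region scaling compensates for differences in staining and acquisition.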

Measurements for aspect ratio, area, circu- 
larity, and average intensity values were per- 
formed in ImageJ. Measurements for contact 
length and interspheroid (intersphere) angle 
were performed as previously described (39). 

Directionality measurements were made using 
the Directionality plug-in in ImageJ. A 100-µm-
by-100-µm ROI was analyzed using the Fourier
components method. For measurements show- 
ing alignment along beads, ROIs were mea- 
sured such that 0° was equal to the line tangent 
to the bead and 90° was radially oriented 
away from the bead. To determine the “frac- 
tion aligned,” measurements from 0° to 30° 
were binned. Directionality measurements were 
plotted using matplotlib in Python. 
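
The "fraction aligned" summary can be sketched from a direction histogram. This is a minimal illustration, not the authors' script: it assumes the histogram is available as bin-center angles in degrees (measured from the bead tangent) with an amount per bin, and that angles are folded onto 0° to 90° so that 0° is tangential and 90° is radial. The function name and the toy histogram are hypothetical.

```python
import numpy as np


def fraction_aligned(angles_deg, amounts, cutoff_deg=30.0):
    """Share of the direction histogram lying within cutoff_deg of the bead tangent."""
    angles = np.abs(np.asarray(angles_deg, dtype=float))
    # Fold onto [0, 90]: orientations are axial, so 150 deg is equivalent to 30 deg.
    angles = np.where(angles > 90.0, 180.0 - angles, angles)
    amounts = np.asarray(amounts, dtype=float)
    return amounts[angles <= cutoff_deg].sum() / amounts.sum()


# Toy histogram: bin centers (degrees from the tangent) and the amount in each bin.
toy_angles = [-80, -20, 0, 10, 45, 90]
toy_amounts = [1, 2, 3, 2, 1, 1]
aligned = fraction_aligned(toy_angles, toy_amounts)
```

In the toy histogram the bins at −20°, 0°, and 10° fall within the cutoff, giving 7 of 10 units aligned.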

All other measurements and statistical analy- 
ses were performed in Prism. Box plots show 
min-max distribution. When calculated, p values 
were determined using a Student’s t test with
Welch’s correction (except for the TFM results,
for which the test is specified in the TFM section above).


Single-cell RNA sequencing 


Dissociated dermal cells from E8 back skins 
consisting of three to five rows of emerging 
follicles were processed using a 10X Genomics 
v3 kit, and libraries were sequenced on an Illumina
NextSeq 500 system at the Genomics Resource
Center at the Rockefeller University. Raw se- 
quencing reads were uploaded to the NCBI 
Sequence Read Archive (SRA, PRJNA926968). 

Sequenced reads were processed using Cell
Ranger version 6.0 (10X Genomics). The reads
were mapped to the chicken genome (Galgal 6),
and the counting matrices were generated.
We identified 7990 cells. The mean number of
reads per cell was 22,981, and the median number
of genes in each cell was 967. Ambient RNA
contamination was estimated and removed
using SoupX (57) with default settings. Dou-
blets and multiple cells in a single droplet were
identified and removed using Scrublet (58).
Corrected matrices were loaded as a Seurat
(v4.0) (59) object. Using Seurat, cells expressing
fewer than 200 genes or with a percentage
of mitochondrial reads greater than 20% were
removed; the gene expression of cells was nor-
malized with scale factor 1000, and the top
3000 variable features of each object were
identified with the “vst” method.
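
The quality-control thresholds and normalization can be illustrated outside Seurat. The sketch below is a NumPy re-expression, not the authors' R code: it assumes a dense cells-by-genes count matrix and assumes that the normalization is Seurat's default log-normalization, log1p(count / total count per cell × scale factor). The function name and the toy matrix are hypothetical, and the toy example lowers the gene threshold to 2 so that it fits on a few lines.

```python
import numpy as np


def qc_and_normalize(counts, mito_gene_mask, min_genes=200,
                     max_mito_pct=20.0, scale_factor=1000.0):
    """Filter cells by detected genes and mitochondrial fraction, then log-normalize."""
    counts = np.asarray(counts, dtype=float)            # cells x genes
    genes_detected = (counts > 0).sum(axis=1)
    totals = counts.sum(axis=1)
    mito_pct = 100.0 * counts[:, mito_gene_mask].sum(axis=1) / np.maximum(totals, 1.0)
    keep = (genes_detected >= min_genes) & (mito_pct <= max_mito_pct)
    kept = counts[keep]
    # Assumed log-normalization: log1p(count / total count per cell * scale factor).
    normalized = np.log1p(kept / kept.sum(axis=1, keepdims=True) * scale_factor)
    return keep, normalized


# Toy matrix: 3 cells x 4 genes; the last gene is mitochondrial.
toy_counts = [[5, 5, 0, 0],
              [10, 0, 0, 0],
              [1, 1, 1, 7]]
toy_mito = np.array([False, False, False, True])
keep, normalized = qc_and_normalize(toy_counts, toy_mito, min_genes=2)
```

In the toy matrix only the first cell passes: the second expresses a single gene, and 70% of the third cell's counts are mitochondrial.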

The remaining 6355 cells were used for di- 
mensional reduction, projection, and cluster- 
ing analysis. Five clusters were identified from 
the initial clustering analysis. Cell type-specific 
gene sets were built based on a single-cell mRNA 
sequencing dataset of human skin (60) to an- 
notate the clusters. On the basis of gene set 
enrichment analysis (GSEA) (61) and the marker
genes of each cluster, these were identified as
fibroblast, blood cell, vascular endothelial cell,
muscle cell, and immune cell (fig. S2A and data
S1). The remaining 5829 fibroblasts were re-
clustered. A population of subjacent dermal
cells with marker gene ZFHX3 expression was
identified (fig. S2B). The remaining 4373 super-
ficial dermal cells were reclustered. A cluster
highly related to the cell cycle was identified
based on GSEA using cell cycle-related gene sets
from MSigDB (C2-CP gene sets), and the marker
gene of this population was expressed uni-
formly in the superficial dermis (fig. S2C and
data S2). Therefore, we concluded that this
cluster did not correspond to any cell popula-
tion with a specific spatial pattern but rather
to a cell cycle state among all cells in the
superficial dermis. After removing this cell
cycle-related cluster, the remaining 2199 cells
were reclustered. Cluster stability analysis was
performed, and the optimal resolution of 0.2
was used. Four clusters were identified (fig. S2D).


Simulations 


The full details of the mathematical model, 
including discussion of equations of motion, 
boundary conditions, numerical details, and 
parameter choices, can be found in the sup- 
plementary text. The code related to the math- 
ematical model and simulations has been 
uploaded to Zenodo (62). In summary, we 
adopted a continuum model of the dermal- 
epidermal deformation as an active multiphase 
viscoelastic flow. Our choice of a continuum 
model is motivated by our primary interest in 
understanding the interaction of aggregates of 
cells—by choosing to coarse grain, we can 
more easily study the broad effect of material 
differences between different tissue layers. In 
particular, we identified five relevant regions 
on the basis of our experimental analysis (Fig. 
6): the follicle epidermis (yellow), the core der- 
mal condensate (orange), the condensate mar- 
gin (dark blue), the deep dermal tissue (light 
blue), and the extratissue fluid external to the 
epidermis (black). We treated the matter that 
makes up each region as a distinct material 
phase and examined their mechanical inter- 
actions with a Cahn-Hilliard style phase-field 
model coupled to equations for force balance, 
strain evolution, and other relevant mechan- 
ical fields (63, 64). In our model, the epidermis 
is a relatively stiff viscous solid. By contrast, 
the dermal tissues are assigned substantially 
weaker elastic moduli and feature internal 
relaxation consistent with remodeling of fluid- 
ized tissue. Because our experiments suggest 
that anisotropic contractility of the margin 
dermal cells drives deformation, we assigned 
an active nematic stress to the boundary phase. 
The orientation of this stress is dependent on a 
nematic field anchored to the phase boundaries 
such that it is parallel to the borders between 
dermal regions and perpendicular to the base- 
ment membrane; we based this choice on
observed elongation of dermal cells along this
direction. Finally, we treated extratissue fluid 
as a passive viscous fluid. 
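
For readers unfamiliar with phase-field models, the passive Cahn-Hilliard ingredient can be illustrated with a toy example. The sketch below is not the authors' five-phase active viscoelastic model and does not use Dedalus: it evolves a single order parameter φ on a periodic grid under ∂φ/∂t = M ∇²(φ³ − φ − κ∇²φ) with a semi-implicit Fourier scheme written in NumPy, omitting force balance, strain evolution, and active nematic stress. All parameter values and names are hypothetical.

```python
import numpy as np


def cahn_hilliard_step(phi, dt=0.1, mobility=1.0, kappa=1.0, dx=1.0):
    """One semi-implicit spectral step of d(phi)/dt = M * lap(phi**3 - phi - kappa * lap(phi))."""
    kx = 2.0 * np.pi * np.fft.fftfreq(phi.shape[0], d=dx)
    ky = 2.0 * np.pi * np.fft.fftfreq(phi.shape[1], d=dx)
    k2 = kx[:, None] ** 2 + ky[None, :] ** 2
    phi_hat = np.fft.fft2(phi)
    nonlinear_hat = np.fft.fft2(phi**3 - phi)
    # Nonlinear term explicit, stiff fourth-order term implicit.
    phi_hat_new = (phi_hat - dt * mobility * k2 * nonlinear_hat) / (
        1.0 + dt * mobility * kappa * k2**2
    )
    return np.real(np.fft.ifft2(phi_hat_new))


rng = np.random.default_rng(0)
phi_initial = 0.05 * rng.standard_normal((64, 64))  # small noise around a mixed state
phi_final = phi_initial.copy()
for _ in range(200):
    phi_final = cahn_hilliard_step(phi_final)
```

Starting from small noise, the field separates into domains while its spatial mean is conserved, because the zero-wavenumber mode is left unchanged by the update. A multiphase tissue model of the kind described above couples several such fields to mechanical equations.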

In the interest of simplicity, we assumed 
that the mechanical effect of the basement 
membrane can be lumped into the stiffness 
of the epidermis. To simulate the empirically 
observed basement membrane weakening, we 
approximated the weakening of the basement 
membrane by applying a localized soft spot 
along the interface between the core dermis 
and epidermis, which we found generates changes
in deformation consistent with experimental
perturbations of the basement membrane. As
expressed in greater detail in the supplemen- 
tary text, wherever possible, parameters were 
chosen on the basis of our own experiments. 
All simulations were conducted with the Deda- 
lus (v3) (65) spectral framework for partial 
differential equations on a parallelized 512-by-512
square mesh, with boundary conditions listed 
in the supplementary text. Although the real 
system is three dimensional, we consistently 
found that a planar representation of our mod- 
el generated sufficiently good matches with 
experimental observations. 

Perturbations to simulation parameters were 
used to test the extent to which our phase-field 
model captured the experimental system. Tis- 
sue stiffness (Fig. 6E) was tuned to be two or 
five times that of the baseline condition. Epi- 
dermis removal simulations (Fig. 6F) were 
performed such that the effect of the epider- 
mis phase was removed. For perturbations 
to model the effects of inhibiting MMPs (Fig. 
6G), the simulated effect of weakening the 
core dermis-epidermis boundary was removed. 
To model the effect of contractility in our system 
(fig. S7A), contractility was reduced by 1/2 
or 1/10 in our simulations. Simulations are 
shown as movies and as time points in the 
supplementary text. In addition, simulations 
of the spheroid merger experiments were per- 
formed using the same model but in each case 
featuring only disks that matched the initial 
geometry and material properties of a partic- 
ular dermal phase in our experiments. In each 
case, the spheres were allowed to merge under 
the passive effects of surface tension for 20 hours. 
We were able to estimate the viscoelastic re- 
laxation timescales associated with each dermal 
phase by tuning parameters in each case until a
close morphological match with experiments
was obtained.


REFERENCES AND NOTES 


1. W. Driever, C. Nüsslein-Volhard, A gradient of bicoid protein in Drosophila embryos. Cell 54, 83-93 (1988). doi: 10.1016/0092-8674(88)90182-1; pmid: 3383244

2. E. Laufer, C. E. Nelson, R. L. Johnson, B. A. Morgan, C. Tabin, Sonic hedgehog and Fgf-4 act through a signaling cascade and feedback loop to integrate growth and patterning of the developing limb bud. Cell 79, 993-1003 (1994). doi: 10.1016/0092-8674(94)90030-2; pmid: 8001146




3. Y. Chen, A. F. Schier, The zebrafish Nodal signal Squint functions as a morphogen. Nature 411, 607-610 (2001). doi: 10.1038/35079121; pmid: 11385578

4. A. M. Turing, The chemical basis of morphogenesis. Bull. Math. Biol. 52, 153-197 (1990). doi: 10.1016/S0092-8240(05)80008-4; pmid: 2185858

5. L. Wolpert, Positional information and pattern formation. Curr. Top. Dev. Biol. 6, 183-224 (1971). doi: 10.1016/S0070-2153(08)60641-9; pmid: 4950136

6. N. Serrano, P. H. O'Farrell, Limb morphogenesis: Connections between patterning and growth. Curr. Biol. 7, R186-R195 (1997). doi: 10.1016/S0960-9822(97)70085-X; pmid: 9162486

7. L. C. Biggs et al., Hair follicle dermal condensation forms via Fgf20 primed cell cycle exit, cell motility, and aggregation. eLife 7, e36468 (2018). doi: 10.7554/eLife.36468; pmid: 30063206

8. J. D. Murray, G. F. Oster, A. K. Harris, A mechanical model for mesenchymal morphogenesis. J. Math. Biol. 17, 125-129 (1983). doi: 10.1007/BF00276117; pmid: 6875405

9. P. Alberch, "Developmental constraints in evolutionary processes" in Evolution and Development, J. T. Bonner, Ed., Dahlem Workshop Report series, vol. 22 (Springer, 1982), pp. 313-332. doi: 10.1007/978-3-642-45532-2_15

10. P. F. Lenne et al., Roadmap for the multiscale coupling of biochemical and mechanical signals during development. Phys. Biol. 18, 041501 (2021). doi: 10.1088/1478-3975/abd0db; pmid: 33276350

11. H. S. Jung et al., Local inhibitory action of BMPs and their relationships with activators in feather formation: Implications for periodic patterning. Dev. Biol. 196, 11-23 (1998). doi: 10.1006/dbio.1998.8850; pmid: 9527877

12. A. E. Shyer et al., Emergent cellular self-organization and mechanosensation initiate follicle pattern in the avian skin. Science 357, 811-815 (2017). doi: 10.1126/science.aai7868; pmid: 28705989

13. K. H. Palmquist et al., Reciprocal cell-ECM dynamics generate supracellular fluidity underlying spontaneous follicle patterning. Cell 185, 1960-1973.e11 (2022). doi: 10.1016/j.cell.2022.04.023; pmid: 35551765

14. H. M. Phillips, M. S. Steinberg, Embryonic tissues as elasticoviscous liquids. I. Rapid and slow shape changes in centrifuged cell aggregates. J. Cell Sci. 30, 1-20 (1978). doi: 10.1242/jcs.30.1.1; pmid: 649680

15. F. Michon, L. Forest, E. Collomb, J. Demongeot, D. Dhouailly, BMP2 and BMP7 play antagonistic roles in feather induction. Development 135, 2797-2805 (2008). doi: 10.1242/dev.018341; pmid: 18635609

16. C. M. Lin, T. X. Jiang, R. B. Widelitz, C. M. Chuong, Molecular signaling in feather morphogenesis. Curr. Opin. Cell Biol. 18, 730-741 (2006). doi: 10.1016/j.ceb.2006.10.009; pmid: 17049829

17. C. F. Drew et al., The Edar subfamily in feather placode formation. Dev. Biol. 305, 232-245 (2007). doi: 10.1016/j.ydbio.2007.02.011; pmid: 17362907

18. K. J. Painter, W. Ho, D. J. Headon, A chemotaxis model of feather primordia pattern formation during avian development. J. Theor. Biol. 437, 225-238 (2018). doi: 10.1016/j.jtbi.2017.10.026; pmid: 29097151

19. M. Scaal et al., BMPs induce dermal markers and ectopic feather tracts. Mech. Dev. 110, 51-60 (2002). doi: 10.1016/S0925-4773(01)00552-4; pmid: 11744368

20. M. Mandler, A. Neubüser, FGF signaling is required for initiation of feather placode development. Development 131, 3333-3343 (2004). doi: 10.1242/dev.01203; pmid: 15201222

21. H. Song, Y. Wang, P. F. Goetinck, Fibroblast growth factor 2 can replace ectodermal signaling for feather development. Proc. Natl. Acad. Sci. U.S.A. 93, 10246-10249 (1996). doi: 10.1073/pnas.93.19.10246; pmid: 8816784

22. S. Noramly, B. A. Morgan, BMPs mediate lateral inhibition at successive stages in feather tract development. Development 125, 3775-3787 (1998). doi: 10.1242/dev.125.19.3775; pmid: 9729486

23. K. Patel, H. Makarenkova, H. S. Jung, The role of long range, local and direct signalling molecules during chick feather bud development involving the BMPs, follistatin and the Eph receptor tyrosine kinase Eph-A4. Mech. Dev. 86, 51-62 (1999). doi: 10.1016/S0925-4773(99)00107-0; pmid: 10446265

24. A. Calo et al., Spatial mapping of the collagen distribution in human and mouse tissues by force volume atomic force microscopy. Sci. Rep. 10, 15664 (2020). doi: 10.1038/s41598-020-72564-9; pmid: 32973235

25. C. S. Campbell, Granular material flows – an overview. Powder Technol. 162, 208-229 (2006). doi: 10.1016/j.powtec.2005.12.008

26. J. Heemskerk, S. DiNardo, R. Kostriken, P. H. O'Farrell, Multiple modes of engrailed regulation in the progression towards cell fate determination. Nature 352, 404-410 (1991). doi: 10.1038/352404a0; pmid: 1861720

27. A. Warmflash, B. Sorre, F. Etoc, E. D. Siggia, A. H. Brivanlou, A method to recapitulate early embryonic spatial patterning in human embryonic stem cells. Nat. Methods 11, 847-854 (2014). doi: 10.1038/nmeth.3016; pmid: 24973948

28. M. Weaver, N. R. Dunn, B. L. Hogan, Bmp4 and Fgf10 play opposing roles during lung bud morphogenesis. Development 127, 2695-2704 (2000). doi: 10.1242/dev.127.12.2695; pmid: 10821767

29. B. Bénazéraf et al., A random cell motility gradient downstream of FGF controls elongation of an amniote embryo. Nature 466, 248-252 (2010). doi: 10.1038/nature09151; pmid: 20613841

30. A. Mongera et al., A fluid-to-solid jamming transition underlies vertebrate body axis elongation. Nature 561, 401-405 (2018). doi: 10.1038/s41586-018-0479-2; pmid: 30185907

31. N. I. Petridou, S. Grigolon, G. Salbreux, E. Hannezo, C. P. Heisenberg, Fluidization-mediated tissue spreading by mitotic cell rounding and non-canonical Wnt signalling. Nat. Cell Biol. 21, 169-178 (2019). doi: 10.1038/s41556-018-0247-4; pmid: 30559456

32. J. W. Spurlin III et al., Mesenchymal proteases and tissue fluidity remodel the extracellular matrix during airway epithelial branching in the embryonic avian lung. Development 146, dev175257 (2019). doi: 10.1242/dev.175257; pmid: 31371376

33. K. Guevorkian, M. J. Colbert, M. Durth, S. Dufour, F. Brochard-Wyart, Aspiration of biological viscoelastic drops. Phys. Rev. Lett. 104, 218101 (2010). doi: 10.1103/PhysRevLett.104.218101; pmid: 20867138

34. K. A. Shorlin, J. R. de Bruyn, M. Graham, S. W. Morris, Development and geometry of isotropic and directional shrinkage-crack patterns. Phys. Rev. E Stat. Phys. Plasmas Fluids Relat. Interdiscip. Topics 61, 6950-6957 (2000). doi: 10.1103/PhysRevE.61.6950; pmid: 11088387

35. Y. Shin, C. P. Brangwynne, Liquid phase condensation in cell physiology and disease. Science 357, eaaf4382 (2017). doi: 10.1126/science.aaf4382; pmid: 28935776

36. S. Alberti et al., A user's guide for phase separation assays with purified proteins. J. Mol. Biol. 430, 4806-4820 (2018). doi: 10.1016/j.jmb.2018.06.038; pmid: 29944854

37. K. Jakab et al., Relating cell and tissue mechanics: Implications and applications. Dev. Dyn. 237, 2438-2449 (2008). doi: 10.1002/dvdy.21684; pmid: 18729216

38. S. Douezan, F. Brochard-Wyart, Active diffusion-limited aggregation of cells. Soft Matter 8, 784-788 (2012). doi: 10.1039/C1SM06399E

39. M. J. Susienka, B. T. Wilks, J. R. Morgan, Quantifying the kinetics and morphological changes of the fusion of spheroid building blocks. Biofabrication 8, 045003 (2016). doi: 10.1088/1758-5090/8/4/045003; pmid: 27721222

40. J. Zhou, H. Y. Kim, J. H. Wang, L. A. Davidson, Macroscopic stiffening of embryonic tissues via microtubules, RhoGEF and the assembly of contractile bundles of actomyosin. Development 137, 2785-2794 (2010). doi: 10.1242/dev.045997; pmid: 20630946

41. D. E. Discher, P. Janmey, Y. L. Wang, Tissue cells feel and respond to the stiffness of their substrate. Science 310, 1139-1143 (2005). doi: 10.1126/science.1116995; pmid: 16293750

42. M. G. Mendez, D. Restle, P. A. Janmey, Vimentin enhances cell elastic behavior and protects against compressive stress. Biophys. J. 107, 314-323 (2014). doi: 10.1016/j.bpj.2014.04.050; pmid: 25028873

43. K. Pogoda et al., Unique role of vimentin networks in compression stiffening of cells and protection of nuclei from compressive stress. Nano Lett. 22, 4725-4732 (2022). doi: 10.1021/acs.nanolett.2c00736; pmid: 35678828

44. C. E. Caicedo-Carvajal, T. Shinbrot, R. A. Foty, α5β1 integrin-fibronectin interactions specify liquid to solid phase transition of 3D cellular aggregates. PLOS ONE 5, e11830 (2010). doi: 10.1371/journal.pone.0011830; pmid: 20686611

45. E. A. Lenselink, Role of fibronectin in normal wound healing. Int. Wound J. 12, 313-316 (2015). doi: 10.1111/iwj.12109; pmid: 23742140

46. H. G. Sundararaghavan et al., Genipin-induced changes in collagen gels: Correlation of mechanical properties to fluorescence. J. Biomed. Mater. Res. A 87A, 308-320 (2008). doi: 10.1002/jbm.a.31715; pmid: 18181104

47. N. Saxena, K. W. Mok, M. Rendl, An updated classification of hair follicle morphogenesis. Exp. Dermatol. 28, 332-344 (2019). doi: 10.1111/exd.13913; pmid: 30887615

48. D. Pinheiro, R. Kardos, E. Hannezo, C. P. Heisenberg, Morphogen gradient orchestrates pattern-preserving tissue morphogenesis via motility-driven unjamming. Nat. Phys. 18, 1482-1493 (2022). doi: 10.1038/s41567-022-01787-6

49. D. Bi, X. Yang, M. C. Marchetti, M. L. Manning, Motility-driven glass and jamming transitions in biological tissues. Phys. Rev. X 6, 021011 (2016). doi: 10.1103/PhysRevX.6.021011; pmid: 28966874

50. D. T. McSwiggen, M. Mir, X. Darzacq, R. Tjian, Evaluating phase separation in live cells: Diagnosis, caveats, and functional consequences. Genes Dev. 33, 1619-1634 (2019). doi: 10.1101/gad.331520.119; pmid: 31594803

51. V. Hamburger, H. L. Hamilton, A series of normal stages in the development of the chick embryo. Dev. Dyn. 195, 231-272 (1992). doi: 10.1002/aja.1001950404; pmid: 1304821

52. D. J. Yuan, L. Shi, L. C. Kam, Biphasic response of T cell activation to substrate stiffness. Biomaterials 273, 120797 (2021). doi: 10.1016/j.biomaterials.2021.120797; pmid: 33878536

53. W. Thielicke, R. Sonntag, Particle Image Velocimetry for MATLAB: Accuracy and enhanced algorithms in PIVlab. J. Open Res. Softw. 9, 12 (2021). doi: 10.5334/jors.334

54. J. P. Butler, I. M. Tolić-Nørrelykke, B. Fabry, J. J. Fredberg, Traction fields, moments, and strain energy that cells exert on their surroundings. Am. J. Physiol. Cell Physiol. 282, C595-C605 (2002). doi: 10.1152/ajpcell.00270.2001; pmid: 11832345

55. H. M. T. Choi et al., Third-generation in situ hybridization chain reaction: Multiplexed, quantitative, sensitive, versatile, robust. Development 145, dev165753 (2018). doi: 10.1242/dev.165753; pmid: 29945988

56. C. A. Schneider, W. S. Rasband, K. W. Eliceiri, NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671-675 (2012). doi: 10.1038/nmeth.2089; pmid: 22930834

57. M. D. Young, S. Behjati, SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data. Gigascience 9, giaa151 (2020). doi: 10.1093/gigascience/giaa151; pmid: 33367645

58. S. L. Wolock, R. Lopez, A. M. Klein, Scrublet: Computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 8, 281-291.e9 (2019). doi: 10.1016/j.cels.2018.11.005; pmid: 30954476

59. Y. Hao et al., Integrated analysis of multimodal single-cell data. Cell 184, 3573-3587.e29 (2021). doi: 10.1016/j.cell.2021.04.048; pmid: 34062119

60. L. Solé-Boldo et al., Single-cell transcriptomes of the human skin reveal age-related loss of fibroblast priming. Commun. Biol. 3, 188 (2020). doi: 10.1038/s42003-020-0922-4; pmid: 32327715

61. A. Subramanian et al., Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. U.S.A. 102, 15545-15550 (2005). doi: 10.1073/pnas.0506580102; pmid: 16199517

62. P. Miller, Code Repository: Morphogens enable interacting supracellular phases that generate organ architecture (v0.9). Zenodo (2023); https://doi.org/10.5281/zenodo.8348004.

63. D. M. Anderson, G. B. McFadden, A. A. Wheeler, Diffuse-interface methods in fluid mechanics. Annu. Rev. Fluid Mech. 30, 139-165 (1998). doi: 10.1146/annurev.fluid.30.1.139

64. J. Kim, Phase-field models for multi-component fluid flows. Commun. Comput. Phys. 12, 613-661 (2012). doi: 10.4208/cicp.301110.040811a

65. K. J. Burns, G. M. Vasil, J. S. Oishi, D. Lecoanet, B. P. Brown, Dedalus: A flexible framework for numerical simulations with spectral methods. Phys. Rev. Res. 2, 023068 (2020). doi: 10.1103/PhysRevResearch.2.023068


ACKNOWLEDGMENTS 


We thank members of the Laboratory of Morphogenesis at The
Rockefeller University for discussion and feedback on the
manuscript and L. Marraffini and L. Hoffman for comments on the
manuscript and/or helpful discussions. We thank the Rockefeller
University's shared Bio-Imaging Resource Center for technical
support and microscope use, the Rockefeller University Genomics
Core Facility for performing single-cell RNA sequencing, T. Carroll
and J.-D. Luo of the Rockefeller University Bioinformatics Resource
Center for the processing and initial analysis of single-cell RNA
sequencing data, A. Pasolli and the Electron Microscopy Resource
Center for technical support and consultation, and Y. Romin and the
Molecular Cytology Core Facility at Memorial Sloan Kettering Cancer
Center for consultation, imaging, and analysis. Funding: This work
was funded by the Burroughs Wellcome Foundation (A.E.S.), the Irma
T. Hirschl Foundation (A.E.S.), the Alfred P. Sloan Foundation, an
[...]ovate Award (A.E.S.), the Searle Scholars Program (A.E.S.), an
NSF National Graduate Research Fellowship (K.H.P.), NIH grant
AI50244 (L.C.K.), and NIH grant P30 CA008748 (Memorial Sloan
Kettering Cancer Center, Molecular Cytology Core Facility). Author
contributions: P.W.M., A.E.S., and A.R.R. are senior authors. A.R.R.
and A.E.S. conceived of the project with input from S.Y. and K.H.P.
A.R.R., A.E.S., S.Y., and K.H.P. developed assays, designed experiments,
and interpreted results. S.Y. and K.H.P. performed experiments and
analyzed data. L.N. assisted with assay development, performed
experiments, and analyzed data. C.R.P. performed MPA experiments,
advised on data analysis, and consulted on the data interpretation.
P.J.S. performed TFM and conducted the analysis with assistance and
advice from L.C.K. A.S. performed the SEM experiments and imaging.
P.W.M. developed the theoretical model and simulations. A.R.R. and
A.E.S. wrote the original draft of the manuscript with input and editing
from S.Y., K.H.P., and P.W.M. All authors discussed the results and
implications and commented on the manuscript at all stages.
Competing interests: The authors declare that they have no
competing interests. Data and materials availability: Raw
sequencing reads were uploaded to the NCBI Sequence Read Archive
(SRA) under accession number PRJNA926968. All data are available
in the manuscript and the supplementary materials, and the code
related to the mathematical model and simulations is available at
Zenodo (62). License information: Copyright © 2023 the authors,
some rights reserved; exclusive licensee American Association for the
Advancement of Science. No claim to original US government works.
https://www.science.org/about/science-licenses-journal-article-reuse

Yang et al., Science 382, eadg5579 (2023) 24 November 2023




SUPPLEMENTARY MATERIALS 


science.org/doi/10.1126/science.adg5579 
Supplementary Text 

Figs. S1 to S7

References (66-80) 

MDAR Reproducibility Checklist 

Movies S1 to S8 

Data S1 and S2 


Submitted 26 January 2023; resubmitted 18 July 2023 
Accepted 27 September 2023 
10.1126/science.adg5579 




RESEARCH 


RESEARCH ARTICLE SUMMARY 


EVOLUTIONARY BIOLOGY 


A rugged yet easily navigable fitness landscape 


Andrei Papkou, Lucia Garcia-Pastor, José Antonio Escudero, Andreas Wagner* 


INTRODUCTION: The fitness landscape is a 
foundational concept in evolutionary biology that 
has also served to study complex optimization 
problems in multiple other disciplines. It is an 
analog to a physical landscape in which a loca- 
tion corresponds to a genotype, and the eleva- 
tion at that location corresponds to the fitness 
of an organism with this genotype. Darwinian 
evolution can be viewed as an exploration of such 
a landscape by evolving organisms, in which the 
highest peaks correspond to the best-adapted 
organisms. When Sewall Wright coined the land- 
scape concept in 1932, he was concerned that 
biological fitness landscapes may have an as- 
tronomical number of peaks, most of which may 
have low fitness. In such landscapes, evolving 
populations are likely to become trapped on 
low fitness peaks from which natural selection 
cannot help them escape. For almost 80 years 
after Wright’s discovery, virtually all work on 
landscapes remained theoretical, and even though 
experimental landscape studies are becoming 
more frequent now, we still do not know wheth- 
er rugged landscapes impair adaptive evolution. 


Empirical fitness landscape 
of dihydrofolate reductase. 
We edited the E. coli genome to
create a fitness landscape
of all 64³ codons encoding
three consecutive amino 
acids (A, Ala; D, Asp; L, Leu) 
of the protein dihydrofolate 
reductase. We measured 

the fitness of each genotype 
in this landscape in the 
presence of the antibiotic 
trimethoprim using a mass 
selection experiment and 
deep sequencing. Even 
though the landscape is 
highly rugged, adaptive 
evolution can find the highest 
peaks from most starting 
locations via short and 
abundant fitness-increasing 
paths. [Created with 
BioRender.com] 

RATIONALE: To tackle this fundamental ques-
tion experimentally, we created a large bio- 
logical fitness landscape (>260,000 mutants) by 
CRISPR-Cas9 gene editing of the key Escherichia 
coli metabolic gene folA, which encodes dihy- 
drofolate reductase. We mapped the fitness 
landscape of this enzyme by exhaustively mu- 
tating nine nucleotides at three amino acid 
positions that can confer resistance to the 
clinical antibiotic trimethoprim. We passaged 
sixfold replicated mutant libraries of all 
folA variants in an antibiotic-containing en- 
vironment and used deep sequencing to ob- 
tain fitness estimates for nearly 99.7% of all 
sequence variants. Our nearly combinatorially 
complete data allowed us to determine the 
ruggedness of this high-dimensional land- 
scape. We identified its fitness peaks, their 
basins of attraction, and evolutionarily accessi- 
ble paths to these peaks. To find out whether 
landscape ruggedness impairs adaptive evolu- 
tion, we simulated the evolutionary dynamics 
on this landscape under various population 
genetics scenarios. 

RESULTS: We found that the landscape is highly
rugged. It has 514 fitness peaks, most of which
have low fitness. Nonetheless, the landscape
has multiple properties of a smooth landscape. 
These include an abundance of monotonically 
fitness-increasing paths to high fitness peaks, 
large basins of attraction of these peaks, and 
easy reachability of these peaks by >75% of 
evolving populations. Furthermore, most evolv- 
ing populations can access multiple high fit- 
ness peaks. All 74 high fitness peaks effectively 
share one enormous basin of attraction (104,496 
variants). This leads to low predictability of 
evolution on the molecular level because each 
population can take multiple alternative paths 
that lead to different high fitness peaks. High 
fitness peaks remain accessible under various 
evolutionary dynamics on the landscape. 


CONCLUSION: Our work shows that adaptive
evolution on realistic high-dimensional and 
rugged fitness landscapes may be easier than 
commonly thought. Our finding calls for new 
and improved theory to understand the counter- 
intuitive geometry of realistic high-dimensional 
fitness landscapes. 


EVOLUTIONARY BIOLOGY


A rugged yet easily navigable fitness landscape 


Andrei Papkou, Lucia Garcia-Pastor, José Antonio Escudero, Andreas Wagner*


Fitness landscape theory predicts that rugged landscapes with multiple peaks impair Darwinian 
evolution, but experimental evidence is limited. In this study, we used genome editing to map the fitness 
of >260,000 genotypes of the key metabolic enzyme dihydrofolate reductase in the presence of 

the antibiotic trimethoprim, which targets this enzyme. The resulting landscape is highly rugged and 
harbors 514 fitness peaks. However, its highest peaks are accessible to evolving populations via 
abundant fitness-increasing paths. Different peaks share large basins of attraction that render the 
outcome of adaptive evolution highly contingent on chance events. Our work shows that ruggedness 
need not be an obstacle to Darwinian evolution but can reduce its predictability. If true in general, the 
complexity of optimization problems on realistic landscapes may require reappraisal. 


The fitness landscape is a nearly century-old
foundational concept in evolutionary biology (1).
Its influence extends to multiple other
disciplines, including ecology (2), synthetic
biology (3), chemistry (4),
computer science (5, 6), the social sciences (7), 
and engineering (8). It is an analogy to a phys- 
ical landscape, in which individual spatial loca- 
tions correspond to genotypes, and the elevation 
at each location corresponds to the genotype’s 
fitness. The best-adapted genotypes occupy the 
highest peak(s) of such a landscape. A popu- 
lation evolving by natural selection explores such 
a landscape, and natural selection drives the 
population uphill to the nearest peak (1, 9, 10). 
A fitness landscape can be single-peaked or 
multipeaked (“rugged”) (11). Ruggedness can 
pose a fundamental challenge to evolution’s 
ability to find a landscape’s highest peaks, be- 
cause a population evolving under the influ- 
ence of natural selection can only travel on 
accessible paths through the landscape, that is,
paths in which each mutational step increases
fitness (9, 12). The reason is that natural
selection favors high fitness genotypes and does
not allow a population to traverse low fitness
valleys between a local peak of intermediate
fitness and nearby higher fitness peaks (9, 13).
Theory predicts that in highly rugged landscapes,
most evolving populations will become
trapped at local peaks of low fitness (14). The
relationship between landscape ruggedness and
peak accessibility has been studied with various
theoretical models developed in the 20th
century (15), such as the NK (16), “house of
cards” (17), and “rough Mount Fuji” models (18),
which may not capture important features of
empirical landscapes (19, 20). Because most
research on adaptive landscapes remained theoretical
until recent decades (9, 11), we still
know little about the ruggedness, and even less 
about the accessibility, of high fitness peaks in 
empirical fitness landscapes. 

Early experimental studies to map empirical 
landscapes measured the fitness of <10* geno- 
types (21-31). Later studies mapped up to 10° 
genotypes that were generated by random 
mutagenesis of a reference (wild-type) geno- 
type. The resulting fitness data are dense around 
the wild type but sparse everywhere else be- 
cause of missing information on double, triple, 
and higher-order mutants (32, 33). In other 
words, such data do not fulfill the important 
requirement of combinatorial completeness, 
which means that the fitness of all allele com- 
binations at the mutagenized loci must be known 
to permit an exhaustive search of evolutionary 
paths. Some studies achieved combinatorial 
completeness by quantifying biochemical prop- 
erties that may be correlated with fitness, such 
as the binding of biological molecules in vitro 
(34-38). Only the most recent works have 
created large and combinatorially complete 
fitness landscapes in microorganisms such as 
Saccharomyces cerevisiae and Escherichia coli. 
However, these works focused on other aspects 
of landscapes, such as the prevalence of higher- 
order epistasis (39, 40), the molecular principles 
underlying genotype-phenotype relationships 
(41), the allosteric effects of mutations (42), 
and the evolution of specificity in protein in- 
teractions (43, 44). 

Here, we address the fundamental question 
of the relationship between ruggedness and 
peak accessibility by mapping a large and 
combinatorially complete in vivo fitness land- 
scape of the E. coli folA gene, which encodes the 
essential metabolic enzyme and antibiotic re- 
sistance protein dihydrofolate reductase (DHFR). 
We find that this landscape is rugged, but its 
ruggedness does not preclude evolving pop- 
ulations from accessing high fitness peaks. 




Experimental design and reproducible
fitness measurements

We performed CRISPR-Cas9 (45) deep muta- 
genesis to edit the folA gene on the bacterial 
chromosome, randomizing nine nucleotide 
positions in a part of the gene that is both 
conserved and implicated in the evolution of 
antibiotic resistance (Fig. 1A). The result is a 
combinatorially complete library of almost 4⁹
(262,144) DNA genotypes. They include all possible
(64³) combinations of codons that encode
three successive amino acids of DHFR (wild-type
sequence: 26A-27D-28L) (fig. S1). Missense
mutations at these positions provide high re- 
sistance to the clinically important antibiotic 
trimethoprim, which inhibits DHFR. We selec- 
ted these three positions because they frequent- 
ly acquire resistance mutations in experimental 
evolution (46, 47) and because their proximity 
to each other facilitates gene editing.
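
The library size quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are ours, and the count of measured variants (261,332) is the denominator reported later in the text for variants with fitness estimates.

```python
from itertools import product

BASES = "ACGT"

# All 64 codons, then all ordered triples of codons (three consecutive positions).
codons = ["".join(c) for c in product(BASES, repeat=3)]
library_size = len(codons) ** 3

# The same count follows from nine independently randomized nucleotide positions.
assert library_size == len(BASES) ** 9 == 262_144

# Coverage: variants with fitness estimates divided by all possible variants.
measured = 261_332
coverage = measured / library_size
print(f"{library_size} genotypes, coverage {coverage:.1%}")
```

Counting by codons (64³) and counting by nucleotides (4⁹) describe the same genotype space, which is why the text uses the two figures interchangeably.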

We exposed a population of E. coli cells expressing
this library to a sublethal dose of tri-
methoprim and measured the fitness of library 
members through deep sequencing in a sixfold 
replicated mass-selection experiment. The re- 
sulting data comprise folA variant frequencies 
before and after selection for 99.7% (261,332/
262,144) of all possible variants (48). Variant
frequencies were highly consistent between
replicates (Pearson’s pairwise correlation
coefficient r between replicates: 0.946 < r < 0.999;
fig. S2). We used these frequencies to calculate 
the fitness of all DHFR variants relative to the 
wild type. Population genetic theory shows (49) 
that it is best to represent all fitness values on 
a natural logarithmic scale (48). On this scale, 
the fitness of the wild type has a value of 0, and 
that of a variant with relative fitness 1 corresponds
to an exponential growth rate that is
e¹ ≈ 2.718 times higher.
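
To make the logarithmic scale concrete, the following minimal sketch computes a log relative fitness from read counts before and after selection. It assumes a common enrichment-based estimator (log change in frequency, minus the same quantity for the wild type) with a small pseudocount. The authors' actual estimator is described in their methods (48) and may differ. The function name, the pseudocount, and the toy sequences and counts are all hypothetical.

```python
import math

def log_relative_fitness(before, after, wild_type, pseudocount=0.5):
    """Natural-log fitness of each variant relative to the wild type.

    before / after: dicts mapping variant -> read count before / after selection.
    """
    n_before = sum(before.values())
    n_after = sum(after.values())

    def log_enrichment(variant):
        # Pseudocount avoids log(0) for variants that vanish after selection.
        f_before = (before[variant] + pseudocount) / n_before
        f_after = (after.get(variant, 0) + pseudocount) / n_after
        return math.log(f_after / f_before)

    reference = log_enrichment(wild_type)
    return {v: log_enrichment(v) - reference for v in before}

# Toy data: illustrative 9-nt sequences and counts, not measurements.
before = {"GCCGATCTG": 1000, "GCCGAACTG": 1000, "TAATAACTG": 1000}
after = {"GCCGATCTG": 1000, "GCCGAACTG": 2718, "TAATAACTG": 10}
fitness = log_relative_fitness(before, after, wild_type="GCCGATCTG")
```

In this toy example the wild type has fitness 0 by construction, the second variant is enriched about e-fold and so has fitness close to 1, and the variant carrying stop codons has strongly negative fitness.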

The fitness values we measured are highly 
consistent with values obtained in a smaller in- 
dependent experiment (N = 250 variants, 
Pearson’s r = 0.972, P = 2 x 10°*; fig. S3, A and
B). Additional experiments confirmed that the 
genetic background of our E. coli strain did not 
substantially alter the relative fitness of our 
DHFR variants (fig. S3, C and D) (48). Finally, 
to validate our fitness estimates, we isolated 
30 DHFR variants and measured their growth 
rate and resistance to trimethoprim in single 
cultures expressing individual DHFR variants. 
We found that relative fitness was highly cor- 
related with the growth rate and resistance ob- 
served in single cultures (figs. S3 to S5; Pearson’s 
r = 0.993, P = 2.3 x 10°?” for growth rate; 
Pearson’s r = 0.987, P = 1.1 x 10 *? for resistance).


Functional DHFR variants vary widely in fitness 
and contain few amino acids at a key position 


Most DHFR variants (93%, 243,303/261,332) 
have very low fitness (Fig. 1B). The distribution 
of their fitness values is consistent with that 
of variants with two stop codons (fig. S6). 

Fig. 1. Creation of combinatorially complete library and distribution of
fitness effects. (A) Experimental design. We used CRISPR-Cas9 gene editing to
create a library of DHFR variants. We targeted a 9-nt segment of the folA
gene on the E. coli chromosome, which encodes three amino acids of DHFR
(A26-D27-L28; A, Ala; D, Asp; L, Leu). In the folded protein, these three amino acids
lie inside the substrate binding pocket (Protein Data Bank ID 6XG5). We edited the
segment with degenerate oligonucleotides to obtain a library comprising 99.7% of
all theoretically possible DHFR genotypes for this segment. We used this library
to perform a mass selection experiment by growing six parallel cultures at the half-
inhibitory concentration of trimethoprim (0.4 µg/ml). We amplified the variable
DHFR region and performed Illumina paired-end deep sequencing. We determined

Because DHFR catalytic activity bears a nearly 
linear relationship with E. coli fitness (50, 51),
we conclude that DHFR is inactive in these 
variants. We thus refer to these variants as 
nonfunctional, to distinguish them from the 
remaining 18,029 functional variants. The func- 
tional variants show highly variable fitness. 
Among them are high fitness variants identi- 
cal to several previously characterized folA mu- 
tants (52) with clinically relevant trimethoprim 
resistance levels that exceed wild-type resistance 
by two orders of magnitude (fig. S7). Almost all 
functional variants have a negatively charged 
aspartic or glutamic acid at position 27 (Asp²⁷
or Glu²⁷; fig. S8A). This amino acid position is
highly conserved in DHFR (53) and highly prevalent
in an alignment of 5000 orthologous
DHFR proteins from proteobacteria (Fig. 1C).
Consistent with a previous study that reported
a functional mutant with a cysteine at position
27 (Cys²⁷) (54), we identified many functional
Cys²⁷ variants, even though the Cys²⁷ allele is
absent from the 5000 proteobacterial sequences
we studied. Despite the importance of position
27, it does not solely determine the sequence-function
relationship, which depends on interactions
between different positions. For example,
Asp²⁷ and Glu²⁷ alleles confer high fitness in
combination with different sets of amino acids
at positions 26 and 28 (fig. S9).


The fitness landscape is rugged 


To study the DHFR fitness landscape, we rep- 
resented our data as a network or graph, in 
which each node (vertex) represents a DHFR 
variant and is associated with the variant’s fit- 
ness. Any two DHFR variants that differ at one 
nucleotide position are immediate (one-mutant) 
neighbors connected by an edge, which repre- 
sents a single mutational step. An evolutionary 
path through this network consists of several 
consecutive mutational steps. We restricted 
such paths to functional DHFR variants by 
removing all nonfunctional variants whose 
neighbors comprise only other nonfunctional 
variants. In contrast, we retained nonfunc- 
tional variants that have at least one func- 
tional neighbor in the network, reasoning that 
rare mutations may create a functional variant 
from a nonfunctional neighbor. 
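
The network construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline (which used igraph in R): the helper names, the toy data, and the rule that a variant is functional when its fitness lies strictly above a `cutoff` are all hypothetical choices made for the example.

```python
from typing import Dict, Iterator, Set

NUCLEOTIDES = "ACGT"


def one_mutant_neighbors(seq: str) -> Iterator[str]:
    """Yield every sequence that differs from seq at exactly one position."""
    for i, old in enumerate(seq):
        for new in NUCLEOTIDES:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]


def build_network(fitness: Dict[str, float], cutoff: float) -> Dict[str, Set[str]]:
    """Return an adjacency map of the pruned network.

    Assumption: a variant is functional if its fitness exceeds cutoff.
    Nonfunctional variants are kept only if at least one of their
    one-mutant neighbors is functional.
    """
    functional = {v for v, f in fitness.items() if f > cutoff}
    kept = set(functional)
    for v in fitness:
        if v not in functional and any(n in functional for n in one_mutant_neighbors(v)):
            kept.add(v)
    # Edges connect kept variants that are one nucleotide apart.
    return {v: {n for n in one_mutant_neighbors(v) if n in kept} for v in kept}
```

With a 9-nt target and three alternative nucleotides per position, `one_mutant_neighbors` yields 9 x 3 = 27 candidates per variant; how many of them become edges depends on which variants survive pruning.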

sequencing read counts for each variant before and after selection and used them 
to calculate relative fitness. [Created with BioRender.com] (B) The distribution 
of fitness effects in the library. The dashed red line indicates the cutoff value for 
nonfunctional variants (-0.508), the blue solid line marks the fitness of the 
wild-type variant, and the insets show the tail of the distribution near the wild type. 
N = 261,332 variants. The inset indicates that the distribution has a heavy tail 
near the wild type. (C) The frequency of amino acids at positions 26, 27, and 28. 
The panels show the frequency of amino acids in the library before selection 
(left panel) and after selection (middle panel). The right panel shows the 
corresponding amino acid frequencies in DHFRs from proteobacteria, based on 
an alignment of 5000 orthologous folA sequences. 

Fifty-two percent of DHFR variants (135,178) 
in this network are contained in the largest 
connected subgraph [or “giant component” 
(55)], which constitutes the fitness landscape 
we analyze. Even though functional variants 
make up only 13% of genotypes (18,019/135,178) 
of this landscape, they form a densely con- 
nected part of the landscape (fig. S8, B and C), 
with as many functional-to-functional edges 
(50%, 161,015/324,044) as nonfunctional-to- 
functional edges. Almost 95% of functional 
variants have the maximal possible number 
of 27 (9 x 3) one-mutant neighbors, whereas 
nonfunctional variants have only one neighbor 
on average (fig. S8, B and C). 
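
Extracting the largest connected subgraph is a standard graph operation. The sketch below uses a plain breadth-first search over an adjacency map; the function name and the input format are assumptions for illustration, and a graph library would normally provide this directly.

```python
from collections import deque
from typing import Dict, Hashable, Set


def largest_component(adjacency: Dict[Hashable, Set[Hashable]]) -> Set[Hashable]:
    """Return the node set of the largest connected subgraph."""
    seen: Set[Hashable] = set()
    best: Set[Hashable] = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best
```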

Next, we quantified our principal indicator 
of ruggedness, the number of peaks (9, 11, 14), 
that is, the number of DHFR variants that have 
higher fitness than all their one-mutant neighbors. 
We found that the landscape has 514 peaks 
and is thus rugged (Fig. 2A). To compare, a 
maximally rugged (uncorrelated) random NK 
landscape would contain a similar number of 
peaks (14) (see supplementary text section S1). 
Most of these peaks (408/514) have low fitness, 
meaning that they are less fit than the wild type 
(fitness < 0). Thirty-three peaks have intermediate 
fitness (from 0 to 1) and are enriched with 
Cys27 variants. The remaining 73 peaks have 
high fitness (>1) and consist exclusively of Asp27 
and Glu27 variants (Fig. 2, A and B). Only one 
Asp27 peak has a fitness of <1, which equals 
0.87. For simplicity, we will refer to all Asp27 
and Glu27 peaks (including this peak) as high 
fitness peaks. 
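
The peak definition used here (a variant fitter than all of its one-mutant neighbors) translates directly into code. The following is a minimal sketch; the function names are hypothetical, isolated variants are excluded by assumption, and the handling of values exactly at 0 or 1 in `classify_peak` is a guess, because the text does not specify the boundaries.

```python
from typing import Dict, List, Set


def find_peaks(fitness: Dict[str, float], adjacency: Dict[str, Set[str]]) -> List[str]:
    """Variants whose fitness is strictly higher than that of every neighbor."""
    return [
        v for v, nbrs in adjacency.items()
        if nbrs and all(fitness[v] > fitness[n] for n in nbrs)
    ]


def classify_peak(f: float) -> str:
    """Fitness classes from the text: low (< 0), intermediate (0 to 1), high (> 1)."""
    if f < 0:
        return "low"
    return "intermediate" if f <= 1 else "high"
```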

Because the location of different peaks in a 
landscape may affect their evolutionary acces- 
sibility, we sought to determine whether peaks 
are close together in the landscape. We did 
so by computing the genetic (nucleotide) dis- 
tance between all pairs of the 514 peaks, that 
is, the minimal number of single-nucleotide 
changes that are needed to convert one peak 
into the other. We compared the resulting dis- 
tance distribution with that of all pairs of 514 
variants chosen at random from the landscape. 
Their mean distances are very similar and lie 
within 1% of each other (d = 6.67 for 
peaks versus d = 6.61 for random variants, two-sided 
Kolmogorov-Smirnov test D = 0.03, P < 
10⁻³⁰⁸, N = 131,841; Fig. 2C). This pattern also 
persists if we consider amino acid distances 
instead of nucleotide distances (Fig. 2D). Most 
importantly, the pattern extends to the high 
fitness peaks (i.e., Glu27 and Asp27; fig. S10). In 
sum, the DHFR landscape has many fitness 
peaks that are scattered across the landscape 
(fig. S11). 
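
For sequences of equal length, the minimal number of single-nucleotide changes between two variants is their Hamming distance. A minimal sketch of the pairwise comparison follows; the function names are hypothetical, and the statistical test itself is not reproduced.

```python
from itertools import combinations
from statistics import mean
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_distance(variants: Sequence[str]) -> float:
    """Mean Hamming distance over all unordered pairs of variants."""
    return mean(hamming(a, b) for a, b in combinations(variants, 2))
```

For 514 peaks, the number of unordered pairs is 514 x 513 / 2 = 131,841, which matches the N reported for this comparison.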


Fitness peaks are highly accessible 


Considering the ruggedness of the landscape, 
one might expect that any one peak may only 
be accessible from a small fraction of variants. 
To find out, we first examined for each variant 
and peak whether accessible paths exist 
from the variant to the peak. Such paths consist 
only of beneficial (fitness-increasing) mutational 
steps, because fundamental population 
genetic principles dictate that weakly deleterious 
mutations are unlikely to go to fixation 
in an organism such as E. coli, with its large 
effective population size of 10⁸ individuals (56). 

More specifically, we determined the size of 
each peak’s basin of attraction, that is, the total 
number of variants from which the peak is accessible. 
The size of this basin varies considerably 
depending on peak fitness and which amino 
acid is present at the critical position 27 (Fig. 
3A). In our dataset, low fitness peaks generally 
have small basins, with a median size of only 
28 variants (0.02% of all variants). Intermediate 
fitness peaks (Cys27) have larger basins, 
with a median size of 7667 variants (5.7%). Most 
notably, high fitness peaks (with Glu27 and 
Asp27) have very large basins, whose median 
size comprised the majority (69% or 93,597) of 
variants. In general, peaks with higher fitness 
had significantly larger basins of attraction 
(Spearman’s ρ = 0.61, P = 6.8 x 10°’, N = 
514; fig. S12). 
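
A basin of attraction can be computed by searching backward from a peak along edges that increase fitness toward it. The sketch below is one way to do this; the function name is hypothetical, and excluding the peak itself from its basin is an assumption.

```python
from collections import deque
from typing import Dict, Set


def basin_of_attraction(peak: str, fitness: Dict[str, float],
                        adjacency: Dict[str, Set[str]]) -> Set[str]:
    """All variants with at least one strictly fitness-increasing path to peak."""
    basin: Set[str] = set()
    visited = {peak}
    queue = deque([peak])
    while queue:
        node = queue.popleft()
        for prev in adjacency[node]:
            # The step prev -> node is accessible only if it raises fitness.
            if prev not in visited and fitness[prev] < fitness[node]:
                visited.add(prev)
                basin.add(prev)
                queue.append(prev)
    return basin
```

Because every step on the reversed search goes strictly downhill, any variant it reaches has a monotonically increasing path to the peak, and a variant can belong to the basins of several peaks at once.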

Fig. 2. Fitness peaks. (A) The distribution of fitness estimates and corresponding standard errors for 514 variants 
identified as peaks in the landscape. The solid blue line at y = 0 shows the fitness of the wild type. Colors 

The large basin size of the Asp27 or Glu27 
peaks indicates their accessibility. However, 
even though accessible paths to any one high 
fitness peak may exist, they may be few compared 
with the total number of paths. To exclude 
this possibility, we focused on all variants 
within a peak’s basin of attraction and enumerated 
all shortest paths (accessible and inaccessible) 
from each variant to the peak. Not 
unexpectedly, the number of total paths increases 
exponentially with a variant’s distance 
from a peak (Fig. 3B). Among these paths, the 
fraction of accessible paths is very high at 
modest distances to a peak but decreases with 
increasing path length (Fig. 3C). For example, 
at two, three, four, and five mutational steps 
away from a peak, 86, 62, 39, and 21% of all 
paths are accessible, respectively. At the distance 
of nine mutational steps, only 1% of 
paths remain accessible. 
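
The path enumeration can be illustrated for the simplest case, in which every shortest path changes each differing position exactly once. The sketch below assumes that case (paths whose intermediates are missing from the fitness table are skipped); the function name is hypothetical, and the authors' own enumeration on the pruned network may differ.

```python
from itertools import permutations
from typing import Dict, Tuple


def shortest_path_counts(start: str, peak: str,
                         fitness: Dict[str, float]) -> Tuple[int, int]:
    """Return (all shortest paths, accessible shortest paths) from start to peak.

    Each ordering of the differing positions is one candidate shortest path.
    A path is accessible if fitness strictly increases at every step.
    """
    diff = [i for i, (a, b) in enumerate(zip(start, peak)) if a != b]
    total = accessible = 0
    for order in permutations(diff):
        current, increasing, valid = start, True, True
        for pos in order:
            nxt = current[:pos] + peak[pos] + current[pos + 1:]
            if nxt not in fitness:  # intermediate variant not in the landscape
                valid = False
                break
            if fitness[nxt] <= fitness[current]:
                increasing = False
            current = nxt
        if valid:
            total += 1
            accessible += int(increasing)
    return total, accessible
```

In a complete sequence space, a variant at distance d from a peak has d! such orderings, which is why the total number of paths grows so quickly with distance.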

In a perfectly smooth landscape, the length 
of the shortest accessible path from any one 
variant to a high fitness peak equals the genetic 
distance between the variant and the peak. 
In contrast, in a rugged landscape, even the 
shortest accessible path may meander through 
the landscape and thus be much longer than 
this genetic distance (52). However, despite our 
landscape’s ruggedness, this is not the case. 
Specifically, the mean length of the shortest 
accessible path between a variant and a high 
fitness peak (mean ± SD = 6.65 ± 1.9 mutations) 
is less than one mutation longer than in 
a smooth landscape (6.06 ± 1.22 mutations, 
two-sided Kolmogorov-Smirnov test D = 0.15, 
P < 10⁻³⁰⁸, N = 6,748,190) (Fig. 3D). Thus, high 
fitness peaks have large basins of attraction 
containing many short and accessible paths. 


Adaptive evolution can easily reach high 
fitness peaks via short paths 


Even though high fitness peaks appear highly 
accessible, their proportion (14%) is small com- 
pared with the majority of low fitness peaks. 
Would selection drive most evolving popula- 
tions to one of the many (408) low fitness 
peaks? To find out, we first simulated adaptive 
evolution in the strong-selection weak-mutation 
regime (48, 57). This choice is motivated by the 
extremely low mutation rate for the 9-nt (nucleotide) 
mutational target in the E. coli genome 
[(9 positions) x (2.2 x 10⁻¹⁰ mutations per 
position per generation)] (58). In this regime, 
adaptive evolution effectively becomes an adaptive 
walk (48). 

Fig. 3. High fitness peaks are easily accessible. (A) Basin size of fitness peaks 
depends on the amino acid at position 27 (horizontal axis). Basin size is shown 
both as the number of variants in the basin (left vertical axis) and as the percentage 
of the total number of the variants in the landscape (right vertical axis). (B) The 
landscape contains many paths to high fitness peaks. The vertical axis shows the 
total number of any shortest paths per variant to a high fitness peak. The horizontal 
axis shows the length of the shortest path. Red and blue boxplots summarize the 
number of paths for any shortest paths and accessible shortest paths. Each box 
spans the interquartile range (IQR), each horizontal line inside a box indicates 
the median value, and each whisker extends to the minimum or maximum value 
within a 1.5 IQR interval. The data values beyond the 1.5 IQR interval are not shown. 
N = 5,509,409,778 paths. (C) Proportion of accessible paths depends on path 
length. The vertical axis shows the proportion of accessible ... 
fitness peaks. Blue circles indicate the mean proportion of accessible paths at each 
length. N = 4,876,880 variant-peak pairs. (D) Length of shortest accessible paths. 
The blue line shows the length of shortest accessible paths leading from individual 
variants to each fitness peak. The red line shows the distribution for the length 
of any shortest path between corresponding variant-peak pairs (N = 6,748,190). 
(E) Cumulative distribution of fitness values reached by 10⁶ adaptive walks starting 
from randomly selected variants. Dashed vertical line (x = 0): fitness of the wild 
type; dotted vertical line (x = 0.87): fitness of the lowest among all high fitness 
(Asp27/Glu27) peaks. (F) Length of adaptive walks. The vertical axis shows the length 
of adaptive walks that reached a high fitness peak. The horizontal axis shows the 
genetic distances between each starting variant and the attained peak. Red line (y = 

We simulated 10⁶ such walks, each starting 
from a randomly chosen DHFR variant (functional 
or nonfunctional). For all immediate 
neighbors that are only one mutational step 
away from this variant, we then calculated the 
fixation probability of the corresponding mutation. 
To this end, we used a well-established 
expression derived by Kimura for the probability 
that a new mutation sweeps through a 
population and becomes fixed (48, 59). This 
expression takes into account the fitness difference 
of the mutant to the starting variant, 
as well as the influence of genetic drift, whose 
strength falls with increasing effective population 
size. Because E. coli has very large populations 
(>10⁸ individuals), most fixation events are 
driven by selection. A population takes each mutational 
step with a probability that corresponds 
to this fixation probability. We repeated this 
procedure for every step of an adaptive walk 
until the walk reached a fitness peak. 
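
The walk procedure can be sketched as follows. The exact fixation expression the authors used is given in their supplementary materials (48) and may differ; the code below assumes one common haploid form of Kimura's diffusion result, P = (1 - e^(-2s)) / (1 - e^(-2Ns)), with s taken as the fitness difference between mutant and resident. The function names, the default population size, and the 50-step cap in the signature are illustrative assumptions.

```python
import math
import random
from typing import Dict, List, Set


def fixation_probability(s: float, n: float) -> float:
    """Kimura-style fixation probability of a new mutant (assumed haploid form)."""
    if s == 0:
        return 1.0 / n  # neutral limit
    exponent = -2.0 * n * s
    if exponent > 700:  # exp() would overflow: strongly deleterious, never fixes
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def adaptive_walk(start: str, fitness: Dict[str, float],
                  adjacency: Dict[str, Set[str]], n: float = 1e8,
                  max_steps: int = 50, rng: random.Random = random.Random()) -> List[str]:
    """Follow fixation events from start until a fitness peak is reached."""
    path = [start]
    current = start
    for _ in range(max_steps):
        nbrs = sorted(adjacency[current])
        # Stop at a peak: no neighbor has higher fitness.
        if not any(fitness[v] > fitness[current] for v in nbrs):
            break
        weights = [fixation_probability(fitness[v] - fitness[current], n) for v in nbrs]
        # The next variant is drawn in proportion to its fixation probability.
        current = rng.choices(nbrs, weights=weights, k=1)[0]
        path.append(current)
    return path
```

At large population sizes the weights of deleterious neighbors are effectively zero, so the walk behaves like a stochastic uphill climb in which larger fitness gains are favored.
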
Despite the predominance of low fitness 
peaks in our landscape, 76.5% of the adaptive 
walks reached a high fitness peak (Fig. 3E). 
These high fitness peaks exhibited trimethoprim 
resistance at least two orders of magnitude 
higher than the wild-type resistance 
(fig. S7). Analyzing the evolutionary trajectories 
leading to Asp and Glu peaks, we observed that 
78% of walks began with acquiring a Glu27 or 
Asp27 allele, before converging to high fitness 
peaks (fig. S13, A and B). This suggests 
that the fitness advantage of Asp27 and Glu27 
alleles, together with strong selection, drives 
populations to high fitness peaks (fig. S14). 
However, additional analysis revealed that 
nearly all accessible paths in our landscape lead 
to high fitness peaks, regardless of the selection 
gradients along these paths (see supplementary 
text section S2). On average, each variant 
has 1200.4 ± 1685.8 shortest accessible paths 
leading to high fitness peaks (mean ± SD, N = 
134,662), compared with only 1.9 ± 5.7 paths 
to low fitness peaks (fig. S13D). Notably, adaptive 
walks tend to find peaks via relatively 
short paths, requiring only 5.6 ± 2.1 mutational 
steps (mean ± SD, N = 765,181) (Fig. 3F). Hence, 
most populations can easily reach high fitness 
peaks via short and abundant paths. 

x): length of the shortest possible path to a high fitness peak, which is given by 
the genetic distance. Elements of boxplots are as in (B) (N = 765,181). 

High fitness peaks are consistently accessible 
to Darwinian evolution across the different 
assumptions we tested in the adaptive walk 
simulations (fig. S15). However, these simulations 
overlook population dynamics and demographic 
stochasticity, which could influence 
peak accessibility by introducing clonal competition 
(60) or allowing populations to stochastically 
escape fitness peaks (61). To address 
this limitation, we performed individual-based 
simulations at realistic population sizes and 
mutation rates (48, 56, 62). When simulating 
evolution in 2600 populations for 100,000 discrete 
generations (fig. S16), we found that 
73.1% of populations reach a high fitness peak 
(fig. S15), with the majority requiring fewer 
than 10,000 generations to do so (median ± 
interquartile range = 4870 ± 10,552 generations; 
fig. S17). These populations underwent 5.2 ± 1.8 
selective sweeps (mean ± SD, N = 2034; fig. 
S15F), a number that agrees with the mean 
length of adaptive walks (5.6 ± 2.1 mutational 
steps). In sum, individual-based simulations 
confirm the high evolutionary accessibility of 
Asp and Glu peaks, despite the competition 
between clones (fig. S17, C to F) and rare cases 
of fitness peak escape (fig. S17, A and B). 


Simultaneous accessibility of peaks leads to 
contingent evolution 


Fig. 4. Overlap of basins of attraction and evolutionary contingency. (A) The heatmap shows 
the fraction of variants shared by different basins of attraction. Each row and each column 
correspond to one peak, and peaks are classified according to the amino acid at position 27. 
A value of one (red) means that the basins of the corresponding peaks comprise identical sets 
of variants, and a value of zero (yellow) means that two basins do not share any variants. 
The results are presented as a symmetric matrix of pairwise fractional overlaps between pairs 
of all 514 peaks. (B) A magnified portion of the matrix including only peaks containing 
Asp27, Glu27, and Cys27. The basins of high fitness peaks (Glu27 and Asp27) share more than 90% 
of their variants. (C) Distribution of the number of high fitness peaks (Asp27/Glu27) that are 
accessible to each variant (N = 135,178). (D) The distribution shows the number of high fitness 
peaks discovered by a total of 1000 populations evolving through adaptive walks that start 
from the same variant. Data are based on 10⁶ adaptive walks, such that 10³ walks started from 
the same variant among 10³ randomly chosen starting variants. Adaptive walks starting from 
21% (209/1000) variants did not reach any high fitness peak (bar at x = 0). For the remaining 
starting variants, adaptive walks reached multiple high fitness peaks. (E) Adaptive walks 
preferentially attain some high fitness peaks. The vertical axis shows the percentage of 
adaptive walks (out of 1000) that reached the most preferred peaks (horizontal axis). Each 
gray line summarizes the data for 1000 adaptive walks started from the same variant. To reduce 
visual clutter, gray lines are shown for only 24 randomly selected starting variants (24 gray 

Because high fitness peaks have enormous basins 
of attraction, we hypothesized that variants 
must be members of multiple basins. To 
quantify this overlap between different basins, 
we first determined the proportion of variants 
shared by any two basins. We found that Glu27 
and Asp27 peaks shared 90.1 ± 6.4% (mean ± 
SD, N = 2701) of the variants in their basins 
(Fig. 4A). In contrast, the basins of the Cys27 
peaks shared a much smaller proportion of their 
variants on average (mean ± SD = 22 ± 29%, 
N = 903). Furthermore, the basins of Asp and 
Glu peaks shared only 1.6 ± 1.6% (mean ± 
SD, N = 3182) variants with those of Cys27 
peaks (Fig. 4B). In other words, only a small 
proportion of variants had simultaneous access 
to both Asp27/Glu27 and Cys27 peaks. Overall, 
77% (104,496/135,176) of variants (functional 
and nonfunctional) had access to more than 
one high fitness peak. Notably, 47.5% of all 
variants in the landscape had simultaneous 
access to all 74 high fitness peaks (Fig. 4C). 
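
The pairwise overlap between basins can be sketched as below. The text does not state the exact overlap formula here, so the Jaccard index (shared variants divided by the union of both basins) is an assumption, as are the function names.

```python
from itertools import combinations
from typing import Dict, Hashable, Set, Tuple


def basin_overlap(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Fraction of variants shared by two basins (Jaccard index, assumed)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlaps(basins: Dict[str, Set[Hashable]]) -> Dict[Tuple[str, str], float]:
    """Overlap for every unordered pair of peaks, keyed by the two peak names."""
    return {
        (p, q): basin_overlap(basins[p], basins[q])
        for p, q in combinations(sorted(basins), 2)
    }
```

A value of 1 means the two basins contain identical sets of variants and 0 means they share none, matching the scale described for the heatmap. For 74 peaks this yields 74 x 73 / 2 = 2701 pairs, the N reported for the Glu27 and Asp27 comparison.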

Such simultaneous accessibility of multiple 
peaks can give rise to evolutionary contingency— 
the dependence of a historical process on chance 
events—because adaptive evolution starting 
from the same location can lead to different 
high fitness peaks. To test this hypothesis, we 
simulated 10⁶ additional adaptive walks, such 
that 1000 walks started from each of 1000 randomly 
chosen starting variants. Adaptive walks 
originating from the same variant collectively 
reached a median of 31 different high fitness peaks 
(mean ± SD = 29.5 ± 17.8 peaks; Fig. 4D), 
confirming that simultaneous peak accessibility 
renders the identity of an attained fitness 
peak contingent on chance events during adaptive 
evolution. 

lines). The black bold line shows the mean percentage for all 1000 starting variants. 

During adaptive evolution, not all high fitness 
peaks are equally likely to be found by a 
population starting from a particular variant. 
For instance, 36 ± 25% (mean ± SD) of adaptive 
walks starting from the same location 
in the landscape reach a single, most commonly 
attained high fitness peak (Fig. 4E). 
But even walks converging to the same peak 
are likely to use different paths (fig. S18), which 
is another manifestation of contingency in our 
landscape. In sum, the large size and highly 
overlapping basins of attraction of different 
peaks render evolution on our DHFR landscape 
highly contingent on stochastic events. 
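
The two contingency measures used in this section (how many distinct peaks are reached from one starting variant, and what share of walks ends on the most common peak) can be computed from a list of walk endpoints. The sketch below assumes such a list is already available; the function name is hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple


def contingency_summary(reached_peaks: Iterable[Hashable]) -> Tuple[int, float]:
    """Return (number of distinct peaks, share of walks on the most common peak)."""
    counts = Counter(reached_peaks)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no walks supplied")
    # most_common(1) returns [(peak, count)] for the most frequent endpoint.
    return len(counts), counts.most_common(1)[0][1] / total
```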


Discussion 


The DHFR fitness landscape that we mapped 
through CRISPR-Cas9 gene editing is highly 
rugged, harboring 514 (mostly low) fitness peaks. 
At the same time, the landscape displays mul- 
tiple properties expected from a smooth land- 
scape, including an abundance of monotonically 
fitness-increasing paths to high fitness peaks, 
enormous basins of attraction of these peaks, 
and easy reachability of these peaks by most 
evolving populations. High peak accessibility 
in a rugged landscape contradicts the predictions 
of classical computational models of random 
fitness landscapes, such as the NK landscape 
(16). More biologically realistic models are 
needed to explain our findings. One such model 
requires a trade-off between fitness in the 
presence and absence of antibiotics, which is 
not consistent with our data (fig. S19) (63). The- 
oretical explanations of our observations thus 
remain an exciting task for future work. 

Our metric of ruggedness—the number of 
peaks—correlates with other such metrics 
(9, 11, 64). Among the most widely used is the 
incidence of reciprocal sign epistasis (fig. S20A), 
which refers to a nonadditive interaction be- 
tween mutations, in which a combination of 
two deleterious mutations produces a positive 
fitness effect. This type of epistasis causes nonmonotonic 
fitness changes along mutational paths 
and can create a local fitness reduction that separates 
fitness peaks (65). However, reciprocal 
sign epistasis is necessary but not sufficient for 
the existence of multiple peaks (20, 65, 66). In 
our landscape, we found that 12.5% of mutant 
pairs show reciprocal sign epistasis (fig. S20B). 
This value falls within the range of other com- 
binatorially complete landscapes (8 to 22%) 
(40, 64, 67), where peak accessibility has not 
been directly determined. In complex land- 
scapes like these, the relationship between peak 
accessibility and reciprocal sign epistasis may 
also be complex, and simple proxies of rugged- 
ness may be less useful indicators of landscape 
navigability than is commonly assumed. 
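
The distinction between types of pairwise epistasis can be made concrete with a small classifier for one pair of mutations. The sketch below follows the usual textbook definitions; the function name, argument order, and tolerance are assumptions, and it does not reproduce the authors' own analysis.

```python
def epistasis_type(f00: float, f10: float, f01: float, f11: float,
                   tol: float = 1e-9) -> str:
    """Classify a two-mutation square.

    f00 is the background, f10 and f01 are the single mutants,
    and f11 is the double mutant.
    """
    # Effect of each mutation in the absence and presence of the other.
    a_alone, a_with_b = f10 - f00, f11 - f01
    b_alone, b_with_a = f01 - f00, f11 - f10
    a_flips = a_alone * a_with_b < 0
    b_flips = b_alone * b_with_a < 0
    if a_flips and b_flips:
        return "reciprocal sign"
    if a_flips or b_flips:
        return "sign"
    if abs((f11 - f00) - (a_alone + b_alone)) > tol:
        return "magnitude"
    return "none"
```

For example, two mutations that are each deleterious alone (fitness -1 relative to a background of 0) but beneficial together (fitness 1) are classified as reciprocal sign epistasis, which is the case described above.
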
Despite the presence of reciprocal sign epis- 
tasis, high fitness peaks remain accessible via 
short and abundant paths (Fig. 3). As a result, 
extradimensional bypasses (38, 68, 69) (i.e., 
indirect and longer paths that detour around a 
local fitness reduction) do not play a major 
role in rendering fitness peaks accessible (supplementary 
text section S3 and fig. S20, C and 
D). In our analyses, we initially focused on 
epistatic interactions between two nucleotide 
positions. When exploring epistasis at three or 
more positions (70), we identified the existence 
of such higher-order interactions (fig. S21, A 
to D). However, no more than three orders 
are needed to explain 93% of fitness variation 
in the landscape (fig. S21, E to K), and the 
strongest interactions frequently involve nucleotide 
positions within one codon (fig. S21, B 
and C). Finally, we detected another type of 
epistatic interaction, known as diminishing 
returns epistasis. This epistasis manifests as 
a reduction in fitness gains from beneficial 
mutations as evolving populations approach 
the maximum fitness (fig. S22). Diminishing 
returns epistasis represents a form of global 
epistasis, as it universally affects most mutations 
at high fitness levels (71, 72). 

Because the effective population size of E. coli 
exceeds 10⁸ individuals (56), selection is much 
stronger than genetic drift. For any organism 
with a large population size, even subtle differ- 
ences in fitness can be selected upon. For example, 
synonymous mutations can have a measurable 
effect on fitness, because of codon-specific ef- 
fects on mRNA stability, and on the rate and 
fidelity of translation (73-76). We thus consid- 
ered all mutations (including synonymous 
mutations) in our landscape as non-neutral. 
However, we note that any existing technolo- 
gies to measure fitness are limited by measure- 
ment error. Specifically, even in our sixfold 
replicated experiments with high sequencing 
coverage, we had low power to detect fitness 
differences of <5% (fig. S23). Such fitness dif- 
ferences exist between 11% of all mutational 
neighbors in the DHFR landscape. When we 
consider these differences to be effectively 
neutral, high fitness peaks remain accessible 
albeit through longer adaptive walks (fig. S24). 
Similarly, small population size does not sub- 
stantially affect peak accessibility (fig. S24). 
Thus, our main findings are not affected by 
the neutrality assumption. 

The simultaneous accessibility of multiple 
fitness peaks results in evolutionary contin- 
gency at the genotype level, because different 
populations arrive at different high fitness peaks. 
Previous studies highlighted that genetic drift, 
mutational stochasticity, and epistasis can create 
such contingency (52, 77, 78). Our observations 
show that such contingency can even arise when 
drift is negligible. Genotypic contingency does 
not preclude the predictability of phenotypic 
evolution (46, 52, 77, 79, 80), as most populations 
attained high fitness in a given environment. 

However, different genotypes with similar 
fitness in one environment can have different 
fitness in another environment. This creates 
bifurcation points at which evolutionary tra- 
jectories can diverge in a new environment 
(78). Some of our high fitness genotypes illustrate 
this principle, as they differ in their fitness 
on a chemically modified version of trimethoprim 
(81). Whereas DHFR alleles Glu27 and 
Thr26 provide resistance against this modified 
antibiotic, allele Arg28 does not. All three alleles 
occur among the high fitness peaks of our 
landscape (Thr26 occurs in 15 peaks, Glu27 occurs 
in 34 peaks, and Arg28 occurs in 1 peak). 
Depending on the peak it started from, a population 
subject to this modified antibiotic would 
follow different evolutionary trajectories. Moreover, 
deformation of landscapes as a result of 
an environmental change may open up new 
evolutionary paths unavailable in a previous 
environment (62, 82-84). Therefore, given the 
frequent interactions between genotype and 
environment (85), genotypic contingency is 
likely to cause phenotypic divergence (78). 

Our landscape is one of the largest empirical 
landscapes currently available, but it covers 
only a small segment of a single gene, whose 
variation constitutes a tiny fraction of sequence 
space. This is an unavoidable limitation of any 
empirical landscape study. It also makes gen- 
eralization difficult because the choice of muta- 
tional target can affect the properties of a 
reconstructed landscape (86). However, some 
features explaining the high navigability of 
our landscape may also apply to other landscapes. 
First, Glu27 and Asp27 peaks essentially 
share one enormous basin of attraction. The 
reason is that glutamate and aspartate are 
physicochemically similar amino acids that are 
encoded by similar codons in the standard 
genetic code. More generally, the genetic code 
has evolved to minimize the effect of muta- 
tions on the physicochemical properties of 
amino acids (87, 88), suggesting that basins 
of attraction shared by adaptive peaks with 
functionally similar amino acids should be 
common in biological landscapes. 

Second, we deliberately targeted a conserved 
gene that encodes a key metabolic enzyme in 
which a single mutation can lead to profound 
fitness and pleiotropic effects (50, 89). Conse- 
quently, only 7% of variants resulted in a func- 
tional enzyme. Despite this constraint, functional 
variants formed a landscape with many fitness- 
increasing paths. If this phenomenon also exists 
in the landscapes of other conserved genes, it 
may help explain how pathogenic bacteria can 
evolve antibiotic resistance by altering essen- 
tial protein targets of antibiotics (e.g., RNA 
polymerase, DNA gyrase, topoisomerase IV, 
and dihydropteroate synthase) (90). A landscape 
derived from a less-conserved gene would have 
harbored many more functional variants and 
thus potentially even more paths accessible to 
Darwinian evolution than we observed. In con- 
trast, fitness landscapes in which amino acid 
positions have even stronger functional inter- 
actions (such as an enzyme’s catalytic triad) 
might be evolutionarily more constrained. 
Future experiments using different mutation- 
al targets, organisms, and sampling design will 
show whether rugged yet highly navigable land- 
scapes are typical or unusual. 

When Sewall Wright coined the concept of 
an adaptive landscape nearly a century ago, he 
was concerned that multipeaked landscapes 
may prevent adaptive Darwinian evolution 
driven by natural selection (7). Simple theoretical 
models developed in the 20th century support 
this concern (14, 16), but experimental 
evidence has been lacking. The landscape we 
studied shows that ruggedness need not impair 
Darwinian adaptation, even though it creates 
an enormous potential for contingent evolu- 
tion. Our results suggest that we will need to 
refine our current theoretical understanding 
of the relationship between landscape rugged- 
ness and navigability. Improved landscape the- 
ory will have to capture realistic properties of 
empirical landscapes, such as the sparsity of 
high-order epistasis (19), the adaptational trade- 
offs between different fitness components 
(63), strong local correlation in fitness among 
genotypes (41), and dense connectivity of se- 
quence space (38, 69, 91). Because applica- 
tions of landscapes extend to different fields 
(4-8), including ecology (2), synthetic biology 
(3), and biomedicine (92), better data and the- 
ory on complex landscapes may also require 
other fields to reevaluate the challenges of 
optimization problems on their landscapes. 
Even though a small fraction of peaks may have 
high fitness, in landscapes like that of DHFR, 
these peaks can easily be discovered by blind 
Darwinian evolution. 


Materials and methods summary 


Full materials and methods are provided in the 
supplementary materials (48). In brief, we edited 
a 9-nt segment of the E. coli chromosomal gene 
folA using the no-SCAR protocol (45). To en- 
able deep mutagenesis, we had to solve two 
problems. First, to allow cells with nonfunc- 
tional DHFR variants to grow, we integrated 
an inducible dfrB9 gene, which encodes an 
alternative dihydrofolate reductase (unrelated 
to E. coli DHFR), into the chromosome (figs. 
S25 to S28). Second, to overcome a reduction 
in the efficiency of gene editing caused by mismatch 
DNA repair, we used a DNA repair-deficient 
strain of E. coli K12 MG1655 (figs. 
S25 and S29). During gene editing, we transformed 
cells with a degenerate oligonucleotide 
encoding all possible codon combinations for 
positions 26, 27, and 28 of DHFR. We induced 
the expression of the guide RNA and of Cas9 
and recovered cells by activating dfrB9 expres- 
sion. We stored the resulting DHFR mutant 
library at -70°C. 

To measure fitness, we recovered the mutant 
library for 9 hours in M9 medium (0.4% 
glucose, 0.2% casamino acids) in the absence 
of dfrB9 expression (fig. S30). Next, we inoc- 
ulated 5 x 10° cells into fresh M9 medium 
(0.4% glucose, 0.2% casamino acids) containing 
0.4 µg/ml of trimethoprim and incubated 
the resulting culture for 14 hours at 225 rpm 
and 30°C (fig. S31). We performed selection in 
six parallel replicate cultures. We isolated DNA 
from all replicates before and after trimethoprim 
selection and used it in a polymerase chain 
reaction to amplify a 214—base pair (bp) DNA 
fragment that spans the 9-nt mutated genome 
locus. We used a commercial sequencing ser- 
vice (150 bp paired-end Illumina NovaSeq 
6000 at Eurofins Genomics, Germany) and
implemented a custom analysis pipeline to
count DHFR variants using a conservative detection
threshold (≥100 complete read pairs
combined in all replicates before selection; fig.
S32). We applied generalized linear regression
to estimate the selection coefficient for each 
detected variant relative to the wild type (48). 
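
As a rough illustration of the counting and scoring steps above, the sketch below applies the detection threshold (at least 100 read pairs combined across replicates before selection) and then computes a simple log-ratio enrichment relative to the wild type. The log-ratio is a stand-in chosen for brevity and is not the generalized linear regression used in the study; the function names, the pseudocount, and all counts are hypothetical.

```python
import math

MIN_READS_BEFORE = 100  # detection threshold from the text, summed over replicates


def is_detected(before_counts):
    """True if a variant has enough read pairs, combined over all pre-selection replicates."""
    return sum(before_counts) >= MIN_READS_BEFORE


def log_ratio_enrichment(var_before, var_after, wt_before, wt_after, pseudocount=0.5):
    """Natural-log change of the variant:wild-type count ratio during selection.

    Positive values mean the variant outgrew the wild type.
    Hypothetical simplification, not the regression model of the study.
    """
    before = (var_before + pseudocount) / (wt_before + pseudocount)
    after = (var_after + pseudocount) / (wt_after + pseudocount)
    return math.log(after / before)


def mean_enrichment(var_before, var_after, wt_before, wt_after):
    """Average the per-replicate log-ratios (all lists are aligned by replicate)."""
    values = [
        log_ratio_enrichment(vb, va, wb, wa)
        for vb, va, wb, wa in zip(var_before, var_after, wt_before, wt_after)
    ]
    return sum(values) / len(values)
```

Averaging per-replicate values is only one way to combine replicates; a regression that models all replicates jointly, as in the study, also yields uncertainty estimates.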


We used fitness data and genetic distance
information to reconstruct the DHFR fitness
landscape as a graph (network), where nodes 
(vertices) represent variants and edges (links) 
represent single-nucleotide substitutions. In 
general, we only considered fitness-increasing 
edges accessible to Darwinian evolution. We 
determined fitness peaks (variants with only 
incoming edges), evolutionarily accessible paths 
(sequences of strictly fitness-increasing edges), 
basins of attraction (sets of all variants with 
evolutionarily accessible paths to a given peak), 
and pairwise overlaps among basins of attrac- 
tion. For all the above analyses, we used the 
igraph v.1.3.4 library in R v.4.1.3 (93). 
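
The graph definitions above can be sketched in a few lines. This is a minimal illustration in plain Python, not the igraph-based R code used in the study; the two-nucleotide genotypes, fitness values, and function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def build_landscape(fitness):
    """Directed edges from the less fit to the strictly fitter single-nucleotide neighbor."""
    edges = {g: set() for g in fitness}
    for a, b in combinations(fitness, 2):
        if hamming(a, b) == 1 and fitness[a] != fitness[b]:
            low, high = (a, b) if fitness[a] < fitness[b] else (b, a)
            edges[low].add(high)
    return edges


def peaks(edges):
    """Variants without outgoing (fitness-increasing) edges.

    Note: a variant with no edges at all would also be returned here.
    """
    return {g for g, out in edges.items() if not out}


def basin(edges, peak):
    """All variants with an evolutionarily accessible path to `peak` (peak included)."""
    reverse = {g: set() for g in edges}
    for src, outs in edges.items():
        for dst in outs:
            reverse[dst].add(src)
    seen, stack = {peak}, [peak]
    while stack:
        node = stack.pop()
        for prev in reverse[node]:
            if prev not in seen:
                seen.add(prev)
                stack.append(prev)
    return seen


# Toy landscape with hypothetical fitness values
toy = {"AA": 1.0, "AC": 1.2, "CA": 0.9, "CC": 1.5, "CG": 1.1, "AG": 1.3}
toy_edges = build_landscape(toy)
toy_peaks = peaks(toy_edges)
overlap = basin(toy_edges, "CC") & basin(toy_edges, "AG")
```

In this toy example, the two peaks share most of their basins, which is the kind of pairwise overlap the analysis quantifies.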


To study adaptive evolution in our landscape,
we performed numerical simulations
assuming strong selection and weak mutation, 
resulting in populations being genetically mo- 
nomorphic most of the time (occupying a 
single genotype in the landscape), which al- 
lowed us to model evolution as an adaptive 
walk on the landscape (57). For each such walk, 
we chose as a starting genotype a random 
nonpeak variant in the graph. We then used 
Kimura’s fixation probability (48, 59, 94) to 
stochastically draw a next variant among the 
genotype’s one-mutation neighbors and repeated 
this process for, at most, 50 mutational steps 
(fixation events) or until the random walk had 
reached a fitness peak. In other random walk 
simulations, we either always selected the fit- 
test neighbor (greedy walks), or we assumed 
a uniform fixation probability for all fitness- 
increasing mutations. We performed a total of 
10° stochastic simulations, except for the deter- 
ministic greedy random walk, where we per- 
formed 134,664 such simulations using custom 
R code and the igraph library (48, 93). More- 
over, we performed 2600 individual-based 
simulations of haploid asexually reproducing 
populations for 100,000 generations using the 
simulation platform simuPOP (48, 95).
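
An adaptive walk of this kind can be sketched as follows. The sketch assumes the haploid form of Kimura's fixation probability, (1 - exp(-2s)) / (1 - exp(-2Ns)), and defines s as the fitness ratio minus one; both choices, the population size, the toy landscape, and the function names are assumptions made for illustration, not the authors' R implementation.

```python
import math
import random


def kimura_fixation(s, n_pop):
    """Fixation probability of a new mutant with selection coefficient s (assumed haploid form)."""
    if s == 0:
        return 1.0 / n_pop
    exponent = -2.0 * n_pop * s
    if exponent > 700:  # exp() would overflow; the probability is effectively zero
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def one_mutation_neighbors(genotypes):
    """Map each genotype to the genotypes that differ from it at exactly one position."""
    return {
        g: [h for h in genotypes if sum(a != b for a, b in zip(g, h)) == 1]
        for g in genotypes
    }


def adaptive_walk(fitness, neighbors, start, n_pop=10**6, max_steps=50, rng=random):
    """Return the genotypes visited, stopping at a peak or after max_steps fixation events."""
    path = [start]
    current = start
    for _ in range(max_steps):
        options = neighbors[current]
        if not any(fitness[n] > fitness[current] for n in options):
            break  # fitness peak reached
        weights = [
            kimura_fixation(fitness[n] / fitness[current] - 1.0, n_pop)
            for n in options
        ]
        current = rng.choices(options, weights=weights, k=1)[0]
        path.append(current)
    return path


# Toy landscape with hypothetical fitness values
toy_fitness = {"AA": 1.0, "AC": 1.2, "CA": 0.9, "CC": 1.5, "CG": 1.1, "AG": 1.3}
toy_neighbors = one_mutation_neighbors(toy_fitness)
walk = adaptive_walk(toy_fitness, toy_neighbors, "CA", rng=random.Random(1))
print(" -> ".join(walk))
```

A greedy walk would replace the weighted draw with the fittest neighbor, and the uniform variant would give equal weight to every fitness-increasing neighbor.
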
INTRODUCTION: Systematic mining of sequencing
databases is a powerful method for discovering
protein families and functional systems.
This approach has uncovered diverse CRISPR- 
Cas systems, which are microbial RNA-guided 
adaptive immune systems that have served as the 
basis of several molecular technologies, notably 
programmable genome editing. However, exist- 
ing methods for sequence mining lag behind the 
exponentially growing databases that now con- 
tain billions of proteins, which restricts the dis- 
covery of rare protein families and associations. 


RATIONALE: We sought to comprehensively 
enumerate CRISPR-linked gene modules in all 
existing publicly available sequencing data.
Recently, several previously unknown bio- 
chemical activities have been linked to pro- 
grammable nucleic acid recognition by CRISPR 
systems, including transposition and protease 
activity. We reasoned that many more diverse 
enzymatic activities may be associated with 
CRISPR systems, many of which could be of low 
abundance in existing sequence databases. 


RESULTS: We developed fast locality-sensitive 
hashing-based clustering (FLSHclust), a par- 
allelized, deep clustering algorithm with line- 
arithmic scaling based on locality-sensitive 
hashing. FLSHclust approaches MMseqs2, a 
Identification and characterization of previously unreported CRISPR-Cas systems. (A) Schematic of
FLSHclust algorithm. (B) Applications of protein clustering in CRISPR discovery. CARF, CRISPR-associated
Rossmann fold. (C) Locus diagrams of three newly identified CRISPR-Cas systems experimentally
characterized in this work. (D) Small RNA sequencing of candidate type VII Cas7-Cas5 ribonucleoprotein
(RNP) (top), and targeted RNA cleavage by candidate type VII CRISPR-Cas system (bottom). DR, direct
repeat; nt, nucleotide; bp, base pair; TBE, tris-boric acid-EDTA buffer.
clustering performance. We applied FLSHclust
in a sensitive CRISPR discovery pipeline and 
identified 188 previously unreported CRISPR- 
associated systems, including many rare systems. 
We experimentally characterized four of the 
newly discovered systems. We examined a 
type IV system with an HNH nuclease domain 
inserted in the CRISPR-associated DNA damage- 
inducible gene G (DinG)-like helicase. We 
found that this system exhibited RNA-guided 
protospacer-adjacent motif (PAM)-dependent 
directional double-stranded DNA (dsDNA) 
degradation, which required both the adeno- 
sine triphosphate (ATP) hydrolysis and HNH 
nuclease functions of the DinG-HNH protein. 
This is the first demonstration of a type IV 
system with a specified interference mecha- 
nism. We characterized two type I systems 
containing HNH nuclease domains inserted in
different subunits of Cascade (Cas8-HNH and 
Cas5-HNH). We found that both of these sys- 
tems performed precise dsDNA cleavage and 
single-stranded DNA (ssDNA) cleavage. We ad- 
ditionally observed collateral cleavage of ssDNA 
by the Cas5-HNH system. We demonstrated that 
both systems can be applied for genome editing 
in human cells and that the Cas8-HNH system is 
highly specific. We also studied candidate type 
VII systems, including a minimal Cas7-Cas5 
effector complex and a distinctive interference 
protein including a B-CASP domain. We showed 
that these systems are likely derived from type 
III-E CRISPR systems and are RNA targeting.
Other CRISPR-linked systems that we found 
include additional potential effector and adap- 
tation components, two previously unknown 
associations of Mu transposons with CRISPR 
systems, and numerous newly identified pro- 
teins and domains associated with type V sys- 
tems. We also identified an instance of potential 
co-option of a Cas9 as an anti-CRISPR mech- 
anism and noted several non-CRISPR hyper- 
variable regularly interspersed repeat arrays. 


CONCLUSION: This study introduces FLSHclust 
as a tool to cluster millions of sequences quickly 
and efficiently, with broad applications in min- 
ing large sequence databases. The CRISPR- 
linked systems that we discovered represent 
an untapped trove of diverse biochemical 
activities linked to RNA-guided mechan- 
isms, with great potential for development 
as biotechnologies. 
Microbial systems underpin many biotechnologies, including CRISPR, but the exponential growth
of sequence databases makes it difficult to find previously unidentified systems. In this work, we
develop the fast locality-sensitive hashing-based clustering (FLSHclust) algorithm, which performs deep
clustering on massive datasets in linearithmic time. We incorporated FLSHclust into a CRISPR discovery 
pipeline and identified 188 previously unreported CRISPR-linked gene modules, revealing many additional 
biochemical functions coupled to adaptive immunity. We experimentally characterized three HNH 
nuclease-containing CRISPR systems, including the first type IV system with a specified interference 
mechanism, and engineered them for genome editing. We also identified and characterized a candidate 
type VII system, which we show acts on RNA. This work opens new avenues for harnessing CRISPR 
and for the broader exploration of the vast functional diversity of microbial proteins. 


The discovery of enzymes and natural biochemical
systems advances molecular evolution
studies, shines light on biological
processes, and provides a starting point
for the development of molecular tech- 
nologies. Over the past few decades, an enor- 
mous variety of protein families and functional 
systems have been discovered through system- 
atic mining of the rapidly growing nucleic acid 
and protein sequence databases. Many of these 
efforts use protein clustering to group similar 
sequences within large datasets (Fig. 1A). The 
output of these algorithms can then be used to 
inform efforts aimed at deep learning on pro- 
tein sequences, three-dimensional (3D) protein 
structure prediction, and genome mining. One 
prime example of the latter is the discovery of 
previously unknown CRISPR systems, which 
has led to the development of transformative
biotechnologies and therapeutic approaches (1-4).
CRISPR systems are microbial RNA-guided 
adaptive immune systems (5). They are com- 
posed of a CRISPR array, which encodes the 
CRISPR RNAs (crRNAs) that give rise to the 
guides; an adaptation module, which integra- 
tes new spacers into the CRISPR array; and an 
interference module that consists of effector
components guided by the crRNAs to match- 
ing targets, which are then cleaved. CRISPR 
effectors can be either complexes of Cas pro- 
teins (e.g., Cascade) in class 1 CRISPR systems 
or single, multidomain proteins (e.g., Cas9, 
Cas12, or Cas13) in class 2 CRISPR systems (6). 
This inherent modularity and programmability 
of CRISPR systems has been capitalized on to 
develop a suite of RNA-guided molecular tech- 
nologies, starting with Cas9-mediated genome 
editing (1).

This toolbox has been expanded through 
computational searches that uncovered many 
CRISPR systems (3, 7-9). However, existing 
methods rely on algorithms that have quadratic 
runtime, such as all-against-all comparisons and 
protein clustering (9), which quickly become 
impractical for mining exponentially growing 
datasets containing billions of proteins (10). 
Linear-scaling clustering methods, such as 
LinClust (11), can address some of these issues
but produce small clusters of highly similar 
sequences that limit the ability to study deep 
evolutionary relationships. Protein domain 
profiles, such as PFAM, can be used to identify
broad abundant associations (12), but they
group remote homologs, which leads to spurious
associations while missing rare ones (13).

To address these limitations and take advan- 
tage of the explosive increase of the known 
structural and functional diversity of proteins, 
we developed fast locality-sensitive hashing- 
based clustering (FLSHclust) (pronounced 
“flash clust”), a parallelized, deep clustering 
algorithm with linearithmic scaling, O(N log N).
FLSHclust can handle billions of proteins, 
enabling efficient analysis of the vast, exponen- 
tially growing sequence databases. We apply 
FLSHclust to identify previously uncharacterized
CRISPR systems, including a candidate
type VII CRISPR system, generating a catalog 
of RNA-guided proteins that expand our under- 
standing of the biology and evolution of these 
systems and provide a starting point for the 
development of new biotechnologies. 
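
To give a sense of why quadratic all-against-all comparison becomes impractical at this scale, the sketch below compares the number of pairwise comparisons with an N log N operation count for a database of 8 billion proteins. Constant factors are ignored and the base-2 logarithm is an arbitrary choice, so the numbers indicate orders of magnitude only; the function names are hypothetical.

```python
import math


def pairwise_comparisons(n):
    """Number of unordered pairs in an all-against-all comparison of n sequences."""
    return n * (n - 1) // 2


def linearithmic_ops(n):
    """Idealized N log2 N operation count, ignoring constant factors."""
    return n * math.log2(n)


n_proteins = 8 * 10**9  # size of the protein database described in the text
quadratic = pairwise_comparisons(n_proteins)
linearithmic = linearithmic_ops(n_proteins)
print(f"all-against-all pairs: {quadratic:.2e}")
print(f"N log2 N operations:   {linearithmic:.2e}")
print(f"ratio:                 {quadratic / linearithmic:.1e}")
```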


Fast locality-sensitive hashing allows for 
deep clustering of all known proteins at 
terabyte scale 


To address the limitation of quadratic time 
complexity inherent to all-to-all comparisons, we 
sought to use locality-sensitive hashing (LSH)—a 
technique that efficiently groups similar, non- 
identical objects in linear time at the cost of 
false positives and negatives (Fig. 1B) (14). Using
this approach, we developed FLSHclust (Fig. 1C
and fig. S1A).

FLSHclust first maps each protein to a reduced
amino acid alphabet, then extracts all
kmers of length k (Fig. 1C). An optimal LSH
family with no false negatives (15) is generated 
using Markov chain Monte Carlo, and for each 
hash function, all hashed kmers are grouped 
into buckets containing similar kmers (Fig. 1D). 
Two representative sequences are then selected 
per bucket, and for all sequences in the bucket, 
a graph edge is formed if an alignment between 
the sequence and each of the representatives 
satisfies the clustering criteria. The resulting 
graph is simplified using a graph degree-aware 
transformation that breaks long chains. Then,
community detection is applied to form groups
of sequences, which are then clustered using
greedy clustering to produce a final set of clusters
(see fig. S1A for schematic of the complete
algorithm, fig. S1B for the pseudocode, and the
supplementary text for additional discussion). 
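
The bucketing idea behind the LSH step can be sketched with positional masking: each hash function drops a fixed set of positions from every kmer, so kmers that differ only at dropped positions land in the same bucket. This is a simplified sketch, not the FLSHclust implementation. It uses randomly chosen masks rather than the optimized family described above, skips the reduced alphabet, and reports every co-bucketed pair instead of comparing against two representatives per bucket. All function names and sequences are hypothetical.

```python
import random
from collections import defaultdict


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def make_masks(k, n_drop, n_hashes, seed=0):
    """One random set of dropped positions per hash function (not optimized)."""
    rng = random.Random(seed)
    return [frozenset(rng.sample(range(k), n_drop)) for _ in range(n_hashes)]


def mask_kmer(kmer, dropped):
    """Hash value of a kmer: the kmer with the dropped positions removed."""
    return "".join(c for i, c in enumerate(kmer) if i not in dropped)


def bucket_sequences(seqs, k, masks):
    """Group sequence names by (hash function index, masked kmer)."""
    buckets = defaultdict(set)
    for name, seq in seqs.items():
        for km in kmers(seq, k):
            for h, dropped in enumerate(masks):
                buckets[(h, mask_kmer(km, dropped))].add(name)
    return buckets


def candidate_pairs(buckets):
    """Pairs of sequences sharing at least one bucket; these would go on to alignment."""
    pairs = set()
    for members in buckets.values():
        ordered = sorted(members)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                pairs.add((a, b))
    return pairs


# Two fixed masks on 6-mers so the outcome is easy to follow
demo_masks = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]
demo_seqs = {"a": "MKTAYI", "b": "MKTGYI", "c": "WWWWWW"}
demo_pairs = candidate_pairs(bucket_sequences(demo_seqs, 6, demo_masks))
```

Here sequences `a` and `b` differ at one position, which the second mask drops, so they share a bucket; `c` shares no bucket with either. The parameters reported for the study (24 hash functions, k = 12, three dropped positions) would correspond to `make_masks(12, 3, 24)` in this sketch.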

We benchmarked the performance and scala- 
bility of FLSHclust against several commonly 
used algorithms, namely MMseqs2, uclust,
CD-HIT, and LinClust (11, 16-18). First, all algo- 
rithms were assessed on their ability to cluster 
1 million proteins from UniRef50 at 30% se- 
quence identity (Fig. 1E) (11, 16-19). FLSHclust’s 
clustering performance (with two tolerated kmer 
mismatches) approached that of MMSeqs2, 
the top-performing quadratic-scaling algorithm 
(Fig. 1E). Moreover, when considering each set 
of proteins with a given distance to its nearest 
neighbor (Fig. 1E), FLSHclust succeeded in 
clustering a higher proportion of these proteins 
compared with LinClust, another algorithm 
with linearithmic scaling (Fig. 1E). We addi- 
tionally found that FLSHclust produces high 
intercluster distances comparable to MMseqs2,
demonstrating high-quality cluster representa- 
tives that tend to be no more than 30% se- 
quence identity from one another (fig. S2A). 

To characterize scalability, we benchmarked 
all algorithms on a panel of UniRef50 subsets 
of different sizes using a two-node computer 
grid with 64 central processing units (CPUs), 

Fig. 1. Design and implementation of FLSHclust. (A) Schematic of applications
of protein clustering in biology and bioinformatics. Archetypal examples of biological
systems that could be found with genome mining approaches for CRISPR are
shown, including CARF proteins and transposon-linked genes. ML, machine learning.
(B) Conceptual schematic of LSH. In contrast to standard hash-based bucketing,
LSH allows similar, nonidentical objects to be bucketed together. The specific family
of hash functions shown in the example is randomized positional masking (bit
masking) on sequences. This family functions by dropping specific positions in each
kmer, where the positions are randomly selected per hash function. (C) Schematic
of the steps of FLSHclust involving LSH. First, all kmers are extracted from each
protein. Then for each hash function, the hash function is applied to all kmers, and
kmers with the same hash value are grouped and then processed independently
to determine which sequences will be aligned in the next step. (D) Optimized
hash functions with no false negatives as calculated using Markov chain Monte
Carlo compared with standard randomized hash functions from the same family.
Probability of bucketing two kmers together in one of the L hash tables as a function
of the number of mismatches between the kmers is shown. The parameters used
for the LSH family functions are L = 24 hash functions, kmer length k = 12, with
three positions dropped per hash function. For the optimized hash functions, the
target number of tolerated mismatches is two, such that the family has no false
negatives in identifying matches between kmers with up to two mismatch positions.
(E) Clustering performance across different algorithms for clustering a 1 million (1M)
protein subset of the UniRef50 database. Linclust/F refers to linclust using 8001
kmers per protein, as opposed to the default of 20. FLSH refers to FLSHclust, with
r = 2 indicating two tolerated mismatches. Clustering performance shows the
fraction of proteins that are grouped into a cluster of size 2 or more as a function of
similarity to their nearest neighbors (NNs). (F) Scaling comparison of various clustering


416 gigabytes of memory, and 2 terabytes of 
solid-state drive storage per node. FLSHclust 
achieved nearly the same average cluster size 
as MMseqs2 at all tested dataset sizes, yet it
exhibits linearithmic scaling in practice, which 
allows it to run faster than all tested quadratic- 
scaling algorithms on a suitably large dataset, 
such as 10 million proteins (Fig. 1F). Moreover, 
as the size of the input dataset increases, the 
number of clusters produced by FLSHclust 
also increases, with the cluster size exhibiting 
a power law distribution, similar to MMSeqs2 
(fig. S2B). We then compared the clustering per- 
formance of FLSHclust, Linclust, and MMSeqs2 
(which required a large server to complete) on 
the full UniRef50 dataset containing 51 million 
proteins (Fig. 1G) and found that FLSHclust 
clustered 58% more proteins compared with 
Linclust and only 12% fewer compared with 
MMseqs2, which suggests that FLSHclust can
achieve a similar clustering performance to
MMseqs2 even on large datasets. Finally, we
compared FLSHclust with other clustering algo- 
rithms against various clustering thresholds and 
found that FLSHclust can cluster proteins down 
to 25% sequence identity with corresponding 
interrepresentative distances (fig. S2, C and D). 
Overall, FLSHclust is fully parallelizable and 
can readily scale to large computing infras- 
tructures while exhibiting high computational 
efficiency (fig. S2, E and F). Our FLSHclust 
implementation is also resilient to computational
node or network failures owing to the underlying
fault-tolerant Apache Spark framework,
which allows FLSHclust to use thousands of 
CPUs seamlessly (20). The ability of FLSHclust 
to comprehensively cluster sequences down to 
25% sequence identity while scaling nearly 
linearly with the number of proteins allows 
it to complement other clustering algorithms 
by efficiently operating in dataset regimes ex- 
ceeding millions or billions of proteins. 


Discovery of previously unreported, rare 
CRISPR systems 


We applied FLSHclust to discover rare CRISPR 
systems. CRISPR systems have diverse archi- 
tectures and mechanisms and are divided into 
six types and 33 subtypes (6). To find additional 
CRISPR systems, we developed a sensitive 
CRISPR discovery pipeline that combines 
FLSHclust and CRISPR repeat finders to iden- 
tify deep clusters of proteins stably associated 
with CRISPR arrays (Fig. 2A). We curated a 
database of 8.8 tera-base pairs (Tbp) of 
prokaryotic genomic and metagenomic contigs 
(excluding metagenomic contigs <2 kbp in 
length) from the National Center for Bio- 
technology Information (NCBI), the Whole 
Genome Sequencing (WGS) project, and the 
Joint Genome Institute (JGI) (Fig. 2A). Coding 
sequences were predicted using Genemark 
(21), and CRISPR arrays were predicted using 
previously developed CRISPR finders (22-25) 
and CRONUS, a tool that we developed to detect 
smaller CRISPR arrays that include imperfect 
repeats as well as other repeat arrays with 
hypervariable spacers (see materials and 
methods and fig. S3 for benchmarking). The
final database contained 8 billion proteins and 
10.2 million CRISPR arrays. Using FLSHclust, 
we iteratively clustered all proteins, resulting 
in 1.3 billion redundancy-reduced (90% sequence
identity) clusters and 499.9 million deep 
(30% sequence identity) clusters. In contrast 
to clustering at 50% identity, which produced 
646.4 million clusters, clustering at 30% with 
FLSHclust produced fewer but larger clusters 
(average cluster size of 2.0 versus 2.5 non- 
redundant proteins, respectively), which makes 
them more conducive for estimating evolution- 
ary statistics. 

To identify genes stably associated with 
CRISPR arrays, we computed a CRISPR asso- 
ciation score (naive score) for each 30% cluster 
by calculating the weighted fraction of non- 
redundant proteins encoded in an operon within 
3 kbp of a CRISPR array over the effective sample 
size of the cluster, Neff, which adjusts for contig
truncations that occur in metagenomic data 
(materials and methods). To capture emerging 
or degrading CRISPR systems—which often only 
contain a single direct repeat (DR) or highly 
diverged DRs (26)—for each CRISPR-associated 
cluster, we selected a representative DR and 
searched its sequence against all other non- 
redundant loci in the cluster (27). The iden- 
tified divergent DR sequences were used to 
compute an enhanced CRISPR association 
score. Finally, to expand our search to find 
genomically distant components of CRISPR 
systems, all proteins considered to be CRISPR 
associated were used as baits for identifying 
additional associated proteins (Fig. 2A). 

To evaluate the performance of this CRISPR 
search pipeline, we compared the naive and 
enhanced CRISPR scores of known CRISPR- 
associated (cas) genes and found that the mean 
naive score of cas genes was 0.44, whereas the 
enhanced score increased to 0.72 (Fig. 2B), 
which highlights the importance of identifying
divergent DRs and mini CRISPR arrays.

algorithms and FLSHclust against subsets of UniRef50. (Top) Compute time on two nodes
each with 64 CPUs. (Bottom) Average cluster size as a function of number of input
sequences. MMseqs2 on the full UniRef50 dataset required substantially more compute
resources to complete within a week and thus was not included in the timing analysis.
Theoretical scaling is shown with big O notation. (G) Comparison of clustering algorithms
as in (E) except on the full UniRef50 dataset. Additionally, a cumulative distribution across
all input proteins is shown. Asterisk refers to the clustering threshold of 30%.


Using the enhanced score, we compared cas 
and non-cas genes and empirically determined 
a cutoff of 0.35, which included most known 
cas genes while removing most non-cas genes 
(Fig. 2C). We then applied this filter to all 
protein clusters with an effective sample size 
Neff ≥ 3, resulting in ~130,000 clusters with
associations to CRISPR-like repeats (out of
16 million total clusters with Neff ≥ 3). After
manual curation, we identified 188 previously 
unreported CRISPR-linked systems, many of 
which included proteins or domains not previously
linked to CRISPRs. All systems identified
in the complete analysis, including those
previously known, are provided in the supple- 
mentary materials (table S1, sequences for 
manually curated set in data S2 and S3, and 
protein-protein associations in data S4; see 
table S2 for equivalences of Cas legacy names). 
Using only the naive score with 50% clusters, 
we recovered 51 fewer systems, with an addi- 
tional 12 losses if only CRT (CRISPR recogni- 
tion tool) (23) was used for identifying CRISPR 
arrays, which underscores the sensitivity of the 
complete pipeline (table S3). 
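
The scoring and filtering logic described above can be sketched as follows. The exact weighting scheme and the definition of the effective sample size are given in the materials and methods, which are not reproduced here, so this sketch simply takes per-locus weights and Neff as inputs. The `Locus` type, its fields, the inclusive distance comparison, and the example values are hypothetical; only the 3-kbp window, the 0.35 cutoff, and the Neff ≥ 3 requirement come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Locus:
    weight: float                        # hypothetical weight of a nonredundant protein locus
    distance_to_array_bp: Optional[int]  # None if no CRISPR array was found on the contig


def association_score(loci, n_eff, max_distance_bp=3000):
    """Weighted fraction of loci within max_distance_bp of a CRISPR array, over n_eff."""
    near = sum(
        locus.weight
        for locus in loci
        if locus.distance_to_array_bp is not None
        and locus.distance_to_array_bp <= max_distance_bp
    )
    return near / n_eff


def passes_filter(score, n_eff, score_cutoff=0.35, min_n_eff=3):
    """Apply the empirical score cutoff and the minimum effective sample size."""
    return n_eff >= min_n_eff and score >= score_cutoff


cluster = [Locus(1.0, 500), Locus(1.0, 2500), Locus(1.0, None), Locus(1.0, 10000)]
score = association_score(cluster, n_eff=4.0)
keep = passes_filter(score, n_eff=4.0)
```

In the pipeline described above, the enhanced score would additionally count loci where a divergent direct repeat was found, which in this sketch amounts to updating `distance_to_array_bp` before scoring.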

The abundance and distribution of different 
CRISPR systems is uneven across sequenced 
bacterial and archaeal genomes (6, 28). To gauge 
how the increasing diversity of sequencing 
data correlates with the CRISPR-Cas diversity 
detectable with our pipeline, we back-calculated 
the time at which clusters (with a minimum of 
two nonredundant CRISPR-associated loci) 
appeared in the public dataset for various 
CRISPR-Cas subtypes of note (Fig. 2D and data 
S1). These calculations track with the abun- 
dance of cas genes, highlighting the importance 
of diverse environmental sampling for discov- 
ering biochemical, mechanistic, and functional 
diversity of CRISPR systems. Notably, many of 
the systems that we identified are rare and 
appeared in the dataset only recently—during 
the past decade. These include various class 
1-derived systems, such as a type IV-derived 
system containing a DNA damage-inducible 
gene G helicase (DinG)-HNH fusion effector, 
type I-derived systems containing Cas8-HNH 
and Cas5-HNH fusion effectors, a candidate 
type VII system, and CRISPR-linked transposons, 
some of which we experimentally characterized. 
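
The back-calculation of appearance times can be sketched directly from its definition: a cluster appears at the earliest date by which at least two of its nonredundant, CRISPR-associated proteins are in the public dataset, and a system appears at the earliest appearance time among its clusters. The function names and dates below are hypothetical.

```python
from datetime import date


def cluster_appearance(deposit_dates, min_members=2):
    """Earliest date at which min_members proteins of the cluster are public, else None."""
    if len(deposit_dates) < min_members:
        return None
    return sorted(deposit_dates)[min_members - 1]


def system_appearance(clusters):
    """Earliest appearance time across all clusters assigned to a system, else None."""
    times = [t for t in map(cluster_appearance, clusters) if t is not None]
    return min(times) if times else None


cluster_a = [date(2019, 5, 1), date(2012, 3, 9), date(2016, 7, 20)]
cluster_b = [date(2010, 1, 1), date(2021, 11, 2)]
singleton = [date(2008, 6, 30)]
first_seen = system_appearance([cluster_a, cluster_b, singleton])
```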


DinG-HNH is a type IV-A variant with 
directional, double-stranded DNA 
nuclease activity 


First, we examined the type IV-A variant with an 
HNH nuclease domain inserted at the C-terminal 

Fig. 2. Discovery of hundreds of rare previously undiscovered CRISPR
systems with a sensitive, scalable CRISPR association pipeline. (A) Schematic
of CRISPR discovery pipeline using no all-to-all comparisons. EMBL, European
Molecular Biology Laboratory; MG-RAST, Metagenomic Rapid Annotations using
Subsystems Technology; 8B, 8 billion. (B) Comparison of naive and enhanced
CRISPR association scores for identifying CRISPR-associated clusters. (Left)
Known Cas genes. (Right) All clusters. N CRs indicates the number of
nonredundant loci with CRISPR arrays. (C) Selection of CRISPR-associated
clusters. (Left) Relative count of Cas (blue) versus non-Cas (gray) clusters as a
function of enhanced CRISPR association score. An empirical threshold of 0.35
enhanced score was selected for identifying CRISPR-associated clusters. (Right)
Relative count of all clusters with Neff ≥ 3. Dotted line demarcates the 0.35
enhanced score cutoff. In total, ~130,000 clusters with an enhanced score of
≥0.35 passed for further analysis. (D) (Top) Line graph showing the number of
proteins over time in the complete dataset, including all projects from public data
(JGI, NCBI, WGS, and EMBL, excluding MG-RAST). (Bottom) Back-calculated
times at which CRISPR-associated, nonsingleton protein clusters appeared in the
public dataset for selected systems. Cluster assignments are fixed across time
using the 30% sequence identity clustering from FLSHclust. The appearance
time of a cluster is the earliest time at which a minimum of two nonredundant,
CRISPR-associated proteins from the cluster are present in the public dataset.
The appearance time of a system (e.g., Cas9, etc.) is the earliest appearance
time across all related clusters. For multigene systems, a signature gene was
used to represent the entire system (type I, Cas7; type III, Csm3; type IV, Csf2).
The inferred appearance time value is an upper bound for the true
CRISPR-associated cluster appearance time in the dataset.


end of the CRISPR-associated DinG-like DEAD/ 
DEAH-box helicase (Fig. 3A). Type IV systems 
appear to have evolved from active type III sys- 
tems but are poorly characterized, with no docu- 
mented mechanism of action (29-31). The 
insertion of the HNH domain into the DinG 
protein could reflect an evolutionary trajectory 
from a type IV system that lost the capacity to 
cleave DNA back to a system fully capable of 
adaptive immunity and interference (Fig. 3A) 
(32). We hypothesized that the HNH domain 
could mediate target cleavage through an un- 
winding and cleavage mechanism analogous to 
the processive target cleavage by Cas3 (33, 34). 
To test this, we heterologously expressed the
DinG-HNH system in Escherichia coli along 
with a CRISPR array encoding a reprogrammed 
spacer sequence targeting a protospacer adja- 
cent to an 8N randomized library (35). We 
observed depletion of 5’ YCN protospacer- 
adjacent motifs (PAMs) (Fig. 3B), indicating 
that the system is capable of programmable 
PAM-dependent RNA-guided plasmid inter- 
ference activity. Small RNA sequencing (RNA- 
seq) of the heterologously expressed operon 
and associated CRISPR array revealed processed 
crRNAs containing a 30-nucleotide (nt) spac- 
er (Fig. 3C) (36, 37). 
To validate the observed activity, we performed
a plasmid transformation efficiency
assay and compared transformation efficiency 
of a target plasmid in cells containing the com- 
plete operon with those containing an empty 
vector control. We found that transformation 
efficiency decreased by 3 orders of magnitude 
when both the complete operon and correct 
PAM were present (Fig. 3D). Through system- 
atic deletion of each protein, we found that all 
five components of the effector complex were 
required for interference activity (Fig. 3D). 
Furthermore, mutation of the conserved nega- 
tively charged residues of the Walker B motif 

Fig. 3. Type IV-A CRISPR systems perform directional dsDNA unwinding
and strand-specific cleavage. (A) Locus diagram of the experimentally studied
DinG-HNH system from Sulfitobacter sp. JL08. (B) Sequence logo for the PAM of
DinG-HNH as determined by a plasmid depletion assay in E. coli. (C) Small RNA-seq
of DinG-HNH effector complex RNP pulldown. (D) E. coli transformation
assays with DinG-HNH and associated effector complex genes and cognate


(D139, E140) and the catalytic triad of the HNH 
domain (H497, D514, H523) in the dinG gene 
abolished activity, which implies that both 
adenosine triphosphate (ATP) hydrolysis and 
HNH nuclease activity are required for inter- 
ference (Fig. 3D) (38). 

To characterize the biochemical mechanism 
of the observed interference activity, we recom- 
binantly expressed and affinity purified both 
the effector ribonucleoprotein (RNP) complex 
and DinG-HNH protein (fig. S4A). When all 
components were combined with a linear double- 
stranded DNA (dsDNA) target, we observed a 
ladder of cleavage products on a denaturing 
gel (fig. S4B), indicating movement of the DinG 
helicase along the target DNA. To test whether 
this movement was directional, we constructed 
two linear dsDNAs with the target site placed 
near either the 5’ or 3’ end of the target strand 
(TS) (Fig. 3E and fig. S4D). We observed activity 
only when the target site was positioned close 
to the 3’ end of the TS, which suggests a model 
in which DinG loads to the nontarget strand 
(NTS) within the R loop and moves in the 5'-to-3'
direction along the NTS while continuously cleav- 
ing both the TS and the NTS (Fig. 3F) (38, 39). 

Together, these results suggest that the role 
of the DinG helicase-nuclease in these type IV 
systems is analogous to that of the Cas3 effect- 
or protein in type I CRISPR systems, whereby 
a helicase and a nuclease act in conjunction to 
unwind and shred the target. However, the 
helicase moieties of the DinG-HNH and Cas3 
are only distantly related, whereas the nucleases 
are unrelated, which indicates that this mecha- 
nism likely evolved twice independently. 


Type I Cascade components are
functionalized with HNH domains for precise 
dsDNA cleavage 


We also identified two previously unknown 
variants of type I CRISPR systems containing 
an HNH nuclease domain inserted into one of 
the Cascade backbone components, either cas8 
or cas5, but most examples of which lack cas3 
(Fig. 4, A and B). The Cas8-HNH system con- 
sists of four genes and is most closely related 
to type I-F1 CRISPR systems, whereas the Cas5- 
HNH system consists of five genes and is most 
closely related to type I-E CRISPR systems. In 
some cases, the cas8 was additionally fused to 
casi, and in other rare cases, remnants or trun- 
cations of cas3 appeared in the vicinity, suggest- 
ing that cas3 progressively disappeared from the 
system (data S2). On the basis of the absence 
of the cas3 helicase-nuclease gene along with 

targets with or without the PAM identified in (B). (E) In vitro reconstituted
DinG-HNH and associated effector complex RNP cleavage of linear dsDNA targets.
Targets either contain the cognate target site at the 5' or 3' end of the TS as
indicated. Only targets on the 3’ end of the TS are cleaved. TBE, tris-boric
acid-EDTA buffer. (F) Model for the mechanism of DinG-HNH-mediated DNA
interference.

the previously unreported association of an 
HNH domain, we hypothesized that both these 
systems might enable precise RNA-guided dsDNA 
cleavage, in contrast to the processive degra- 
dation activity exhibited by Cas3 in canonical 
type I systems (33, 34). 

To test this, we performed a PAM discovery 
assay in E. coli and observed depletion of specific 
PAMs for both systems (Fig. 4, C and D), 
which suggests that both are capable of RNA- 
guided interference activity. Small RNA-seq of the 
recombinantly purified Cascade RNPs showed 
that Cascade binds to crRNAs in each system, 
both containing 32-nt spacers (Fig. 4, E and 
F) (40). 

Next, we confirmed the ability of the Cas8- 
HNH and Cas5-HNH Cascade RNPs to cleave 
dsDNA in a precise, PAM-dependent manner 
(Fig. 4, G and H, and fig. S5). Sequencing of the 
cleavage products for each system showed that 
Cas8-HNH cleaves the TS and the NTS 5 bp 
and 2 bp downstream of the protospacer, re- 
spectively, on the PAM-distal end of the target, 
generating 5’ overhangs (Fig. 4I). By contrast, 
Cas5-HNH cleaves the TS and the NTS 3 to 
4 bp and 8 bp downstream of the protospacer, 
respectively, on the PAM-distal end, generat- 
ing 3’ overhangs (Fig. 4J). 
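
The relation between the reported cut positions and the overhang polarity can be sketched as follows. This is a minimal illustration, not part of the original analysis: the function name `overhang` is hypothetical, and it assumes that both cut offsets are counted in base pairs downstream of the protospacer along the NTS in the 5'-to-3' direction.

```python
def overhang(ts_cut: int, nts_cut: int) -> tuple[str, int]:
    """Classify the DNA end left by staggered cuts on the two strands.

    Assumed convention (not stated explicitly in the text): both offsets
    are bp downstream of the protospacer, measured along the NTS 5'-to-3'.
    """
    length = abs(ts_cut - nts_cut)
    if length == 0:
        return ("blunt", 0)
    # The TS runs antiparallel to the NTS, so a TS cut that lies further
    # downstream leaves a protruding 5' end; the reverse leaves a 3' end.
    kind = "5' overhang" if ts_cut > nts_cut else "3' overhang"
    return (kind, length)


# Offsets quoted in the text for the two systems.
print(overhang(ts_cut=5, nts_cut=2))  # Cas8-HNH
print(overhang(ts_cut=3, nts_cut=8))  # Cas5-HNH, taking the 3-bp TS cut
```

Under this convention the Cas8-HNH offsets give a 3-nt 5' overhang and the Cas5-HNH offsets give a 4- to 5-nt 3' overhang, in agreement with the polarities reported above.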

Fig. 4. HNH-functionalized Cascade subunits perform precise, RNA-guided 
dsDNA cleavage. (A) Locus diagram of the experimentally studied Cas8-HNH 
system from Selenomonas sp. isolate RGIG9219. (B) Locus diagram of the 
experimentally studied Cas5-HNH system from Candidatus Cloacimonetes 
bacterium. (C) Sequence logo for the PAM of Cas8-HNH as determined by a 
plasmid depletion assay in E. coli. (D) Sequence logo for the PAM of Cas5-HNH 
as determined by a plasmid depletion assay in E. coli. (E) Small RNA-seq of 
Cas8-HNH Cascade RNP pulldown. (F) Small RNA-seq of Cas5-HNH Cascade 
RNP pulldown. (G) In vitro reconstituted Cas8-HNH Cascade RNP cleavage of 
linear dsDNA targets, in the presence or absence of a cognate target and/or 
PAM. (H) In vitro reconstituted Cas5-HNH Cascade RNP cleavage of linear 
dsDNA targets, in the presence or absence of a cognate target and/or PAM. 


Given that HNH domains have been observed 
to cleave only a single strand in targeted dsDNA 
(26, 41), we tested both systems for single- 
stranded DNA (ssDNA) cleavage activity. We 
observed that both the Cas8-HNH (fig. S5C) and 
the Cas5-HNH systems (fig. S5D) can cleave 
ssDNA in a PAM-independent manner. We 
additionally found that the Cas5-HNH system, 
but not the Cas8-HNH system, exhibited collateral 
cleavage of ssDNA substrates stimulated by 
dsDNA and ssDNA targets in a PAM-dependent 
and PAM-independent manner, respectively (fig. 
S5, E and F). This is the first reported observation 
of collateral activity in a type I CRISPR-Cas 
system, which suggests convergent evolution 
of this mechanism. 

Altae-Tran et al., Science 382, eadi1910 (2023) 

Finally, we tested whether Cas8-HNH and 
Cas5-HNH can programmably generate short 
insertions or deletions (indels) in mammalian 
cells. We found that both systems are capable 
of inducing indels with varying efficiencies up 
to ~13% (Fig. 4, M and N, and table S4). For 
Cas8-HNH, all protein subunits were required 
for activity (Fig. 4M). For the Cas5-HNH system, 
the Cas11/Cse2 subunit was dispensable for indel 
formation, but its deletion resulted in reduced 
activity (up to ~6%), whereas deleting Cas7 re- 
sulted in minimal activity (up to ~1%). Deleting 
any of the other components ablated activity 
(Fig. 4N). Inactivation of the catalytic residues of 
the HNH domain in each system also abolished 
activity, demonstrating that the HNH domain 
mediates target cleavage in both systems (Fig. 4, 
M and N). To assess the genome-wide specificity 
of cleavage, we performed tagmentation-based 
tag integration site sequencing (42). For Cas8- 
HNH, we detected no off targets for the four 
tested guides, which suggests that this system is 
highly specific (fig. S5G). The 3’ overhangs gener- 
ated by Cas5-HNH cleavage were not compatible 
with blunt-end ligation required for this assay. 


A candidate type VII CRISPR system is a 
precise RNA-guided RNA endonuclease 
complex containing a β-CASP nuclease 


CRISPR systems evolve through modular re- 
placement of Cas components and subdomains, 
as exemplified by the DinG-HNH, Cas8-HNH, 
and Cas5-HNH systems characterized above. 
We further identified a distinct system present 
in diverse archaea containing a β-CASP nuclease 
domain protein. This protein is encoded in a 
predicted operon with Cas7 and Cas5, which, 
together, may form a minimal effector complex, 
and in some cases, a Cas6, which is involved in 
crRNA processing in other CRISPR-Cas systems 
(Fig. 5A, fig. S6A, and table S5) (43). The Cas5 
and the Cas7 of this system are distantly related 
to the type III-D Cas5 and Cas7 proteins, re- 
spectively, with an apparent inactivation of the 
Cas7 catalytic residues that are required for 
target RNA cleavage in type III systems (Fig. 5B 
and fig. S6, B to E, H, and I). 

The β-CASP domain is an ancient nuclease 
fold found in all domains of life that exhibits 
RNA endonuclease, 5’ to 3’ RNA exonuclease, 
and/or DNA nuclease activities in various contexts 
(44). β-CASP domain proteins are involved 
in nonhomologous end-joining DNA repair 
(NHEJ), V(D)J recombination, RNA surveillance, 
mRNA and ribosomal RNA (rRNA) maturation, 
and RNA decay (45-49). Phylogenetic analysis 
of the β-CASP family supports the origin of the 
CRISPR-associated members from a distinct, 
well-defined clade (Fig. 5C and fig. S6F). Structural 
modeling of the β-CASP protein with 
AlphaFold2 (50) shows two distinct domains, 
namely, the N-terminal β-CASP domain (figs. 
S7 and S6G), and a C-terminal adaptor domain 
with structural similarity (but no detectable 
sequence similarity) to the ~200-amino acid 
(aa) C-terminal domain of Cas10 (Fig. 5D), the 
large subunit of type III systems that is involved 
in target RNA interaction (51). Given its 
distinctive domain composition and association 
with CRISPR, we propose to designate 
the β-CASP domain protein of these systems 
Cas14, the next structurally distinct effector 
complex component after Cas12 and Cas13. 

Searching for protospacer matches to the 
CRISPR spacers in these systems revealed a 
pronounced bias toward the antisense strand 
of matching target sequences (Fig. 5F and data 
S5), which suggests that these systems target 
RNA. We further observed that spacers primarily 
target transposon genes, indicating that the 
system could defend against actively expressed 
transposons, unlike other known CRISPR types, 
which primarily target viruses or plasmids (Fig. 
5G and fig. S8). 
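
A strand preference of this kind can be quantified with an exact binomial test against the no-preference expectation of 50%. The sketch below is illustrative only: the function name `antisense_bias` and the example counts are hypothetical, and the test shown is not necessarily the statistic used in this work.

```python
from math import comb


def antisense_bias(n_antisense: int, n_total: int) -> tuple[float, float]:
    """Return (fraction antisense, two-sided exact binomial p-value vs. 0.5)."""
    if n_total <= 0 or not 0 <= n_antisense <= n_total:
        raise ValueError("counts must satisfy 0 <= n_antisense <= n_total, n_total > 0")
    frac = n_antisense / n_total
    # Under the null hypothesis (no strand preference) each outcome k has
    # probability C(n, k) / 2**n; sum all outcomes at least as extreme as
    # the observed one, on both sides of n / 2.
    extreme = abs(n_antisense - n_total / 2)
    tail = sum(
        comb(n_total, k)
        for k in range(n_total + 1)
        if abs(k - n_total / 2) >= extreme
    )
    return frac, tail / 2**n_total


# Hypothetical example: 18 of 20 protospacer matches on the antisense strand.
fraction, p_value = antisense_bias(18, 20)
print(f"fraction antisense = {fraction:.2f}, p = {p_value:.2g}")
```

A small p-value together with a fraction well above 0.5 would indicate that spacers preferentially match the strand complementary to the transcript, as expected for an RNA-targeting system.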

We hypothesized that the Cas14-containing 
system carries out interference through the 
β-CASP nuclease domain, in contrast to the 
distantly related CRISPR subtype III-E, which 
also likely originated from subtype III-D but 
retains a Cas7-based interference mechanism 
(6, 52, 53). We further identified a previously 
unknown type III subtype that, similar to the 
Cas14-containing system, encompasses a single 
Cas7-like and a Cas5-like gene distinct from 
those of the Cas14-containing system (fig. S9A). 
However, these systems also include a Cas10 
with an active phosphohydrolase nuclease do- 
main and an inactivated polymerase domain 
(fig. S9B). Thus, this type III subtype is pre- 
dicted to cleave target DNA but lacks the cyclic 
oligoA-dependent signaling pathway that is 
integrated in many other type III systems. 
These findings together point to convergent 
evolution of minimal effector complexes. 

Purification and small RNA-seq of type VII 
Cas7-Cas5 RNP complexes showed that Cas7 
and Cas5 form a complex that copurifies with 
a processed crRNA containing both a 5’ and 3’ 
DR tag, similar to type I and IV systems (Fig. 
5H) (36, 37, 40). The complex is stable only in the 
presence of the corresponding crRNA (Fig. 5I). 
To test cleavage activity, we separately purified 
Cas14 and mixed it with the purified Cas7-Cas5 
RNP complex and labeled target RNA. We 
observed precise target RNA cleavage only in 
the presence of all proteins and the cognate 
target sequence (Fig. 5J and fig. S10). Inactivation 
of key residues in the predicted Zn(II)-binding 
pocket of the Cas14 β-CASP domain 
abolished cleavage activity (Fig. 5J). Together, 
these results suggest that Cas14 is the nuclease 
effector in these systems. Given the distant 
relationship between the effector complex of 
the Cas14-containing system and those of 
other known CRISPR types, and the substitution 
of the effector nuclease with an unrelated 
nuclease, β-CASP, we propose that the 
Cas14-containing system be classified as type 
VII CRISPR-Cas (see fig. S11 for further com- 
parison across CRISPR types). 

Putative undiscovered CRISPR variants and 
CRISPR-associated genes 

Our biodiscovery pipeline identified many addi- 
tional putative undiscovered systems (Fig. 6, 
figs. S12 to S14, and data S2). In total, we 
identified 188 CRISPR-linked gene modules that, 
to the best of our knowledge, have not been 
reported previously (fig. S14, A to GF, and data 
S2). These systems have been designated as 
UAS-# (unknown associated system) and may 
each contain multiple genes (designated uas#A, 
uas#B, etc., if not previously named). From 
these findings, several themes emerged. First, 
we identified at least 17 cases where the core 
effector modules contained newly identified 
domains or fusions, including the DinG-HNH, 
Cas8-HNH, Cas5-HNH, and candidate type 
VII systems (Fig. 6A). We also discovered a 
VRR-NUC [PD(D/E)XK superfamily] nuclease 
fused to the Cas11 subunit in I-E systems. Apart 
from these inserted domains, we identified a 
type I-B variant with a fusion of Cas5 to Cas3, 
which might allow direct loading of Cas3 
to the target DNA upon its recognition by 
Cascade. Similarly, we found a Cas8-Cas5 fu- 
sion in an incomplete type I-C system that 
apparently lacks Cas3 and may function as a 
DNA binder. 


CRISPR-associated transposons 


A second, related theme is the association of 
newly identified genes with core CRISPR effector 
modules, which is consistent with previous 
studies showing that the RNA-guided mechan- 
ism of CRISPR has been repurposed for dif- 
ferent functions (Fig. 6A) (54-56). For example, 
we discovered Mu transposases (57) associated 
with type V and type I-A systems (designated 
CasMu-V and CasMu-I, respectively) in which 
the effector nuclease activity was lost, either as 
a result of the apparent catalytic inactivation 
of Cas12 through the loss of the RuvC-III motif 
(type V) or through the loss of the entire cas3 
gene (type I). CasMu-I is additionally associated 
with an HTH domain-containing protein and 
a gene denoted casmuC, which encodes an 
inactivated paralog of the associated MuA trans- 
posase. Using AlphaFold2 (50), we predicted 
interaction between the CasMuC protein and 
Cas8, which suggests that CasMuC may serve 
as an adaptor between the transposase and 
the CRISPR effector complex (fig. S15). Using 
sequence alignments, read mapping, and com- 
parison with other Mu transposon ends, we 
identified the left and right ends of the trans- 
poson for both classes of CasMu systems. In 
one example of CasMu-V, we further identified 
a cryptic homing spacer in the CRISPR array 
matching a site 68 bp downstream of the right 
end, suggesting an RNA-guided homing mech- 
anism (Fig. 6A and fig. S16) (58). Thus, CasMu-V 
and CasMu-I appear to be distinct CRISPR- 
associated transposons that use interference- 
defective CRISPR systems for reprogrammable 
RNA-guided transposition, a mechanism that 
was previously known to exist only for transposon 
7 (Tn7)-like transposons (54). 

Fig. 5. Candidate type VII CRISPR system. (A) Locus diagram of the 
experimentally studied candidate type VII system. (B) UPGMA (unweighted pair group 
method with arithmetic mean) dendrogram from HHPred pairwise alignment 
scores of related Cas7s. (C) Phylogenetic tree (FastTree) of β-CASP proteins 
from both bacteria and archaea, including the β-CASP proteins linked to the 
candidate type VII system, which form a distinct clade. (D) (Top) Diagram of the 
domain architecture of Cas14. (Bottom) Superposition of Cas14’s C-terminal 
domain with the Cas10 C-terminal domain from Protein Data Bank ID 6NUD showing 
the Cas10 interface with the target RNA. Both share the four-helix bundle found 
in Cas10 and Cas11 that are known to interact with the TS. (E) Coding sequence 
(CDS) TS preferences of the protospacer matches for the CRISPR array of the 
experimentally studied type VII locus. (F) Targets of the protospacer matches for 
the CRISPR array of the experimentally studied type VII locus. (G) Small RNA-seq 
of type VII Cas7-Cas5 RNP pulldown along with the DR sequences. (H) Size-exclusion 
chromatography of the Cas7-Cas5 copurified with an expressed DR + 
spacer + DR or copurified with an expressed truncated DR + truncated spacer. 
(I) In vitro reconstituted Cas14 and associated effector complex RNP cleavage 
of Cy5-labeled RNA targets, in the presence or absence of cognate target 
sequences. (D66A/H67A) represents mutation of key residues in the predicted 
catalytic Zn(II) binding pocket of Cas14 to alanine. 


Multicomponent Cas12-linked systems 


In addition to transposon association, we iden- 
tified several further examples of previously 
unknown associations with core CRISPR effect- 
or modules. These included combinations of 
Cas12 with proteins, such as Cas3, OMEGA-IscB 
and an HTH domain, and a TPR-DUF3800 
domain-containing protein (Fig. 6A). The Cas12- 
Cas3 system is a putative class 1-class 2 hybrid 
system in which a Cas12m, which is not known 
to exhibit DNA cleavage activity (59), may have 
associated with a Cas3 helicase-nuclease (type 
I-C-like) to enact an interference mechanism 
beyond DNA binding. The Cas12 associated 
with an OMEGA-IscB and an HTH domain pro- 
tein is inactivated, whereas the associated IscB 
protein has an inactivated RuvC domain and an 
active HNH domain, which suggests that it func- 
tions as a nickase; these two RNA-guided mod- 
ules may work in concert to facilitate targeting 
or in opposition to exclude each other under cer- 
tain conditions. We also found that a subbranch 
of Cas12a2 is associated with a TPR + DUF3800 
domain protein and occasionally with a UvrD 
helicase and an additional tetratricopeptide re- 
peat (TPR) domain-containing protein. AlphaFold2 
prediction (50) of the DUF3800 domain-containing 
protein indicated that DUF3800 contains 
a ribonuclease H (RNaseH) nuclease fold 
with a catalytic rearrangement (fig. S17). Addi- 
tionally, the DUF3800 domain has previously 
been found to be associated with putative 
noncoding RNAs (ncRNAs) (60). Together, this 
suggests that it may function as part of the 
interference module or in crRNA biogenesis 
or degradation in these systems. The pres- 
ence of multiple TPR domains, which facilitate 
protein-protein interactions (61), suggests interaction 
between the various components of 
these systems, possibly with consequences for 
the interference mechanism. 

We tested several of these newly identified 
type V systems (CasMu-V, Cas12+TPR-DUF3800, 
Cas12+TPR-DUF3800+UvrD+TPR, Cas12+IscB, 
and Cas12+Cas3) for ncRNA binding by the 
Cas12 effectors by purifying Cas12 proteins and 
sequencing any associated RNA. We found that 
all of these Cas12s copurified with a cognate 
ncRNA, usually a processed crRNA derived from 
the associated CRISPR array (fig. S18), which 
suggests that these are functional CRISPR 
systems in which Cas12 operates as an RNA- 
guided targeting module. 


Biomimicry anti-CRISPR strategy used 
by viruses 


We next examined the dataset to identify homo- 
logs of Cas proteins that have lost CRISPR array 
association. We found a type II-C Cas9 with a 
catalytically inactivated RuvC nuclease domain 
but an active HNH domain, which is encoded 
in phage genomes and associated with an SNF2 
helicase but not with CRISPR arrays (score of 
0) (Fig. 6A and fig. S19A). A putative trans- 
activating crRNA (tracrRNA) was found in the 
vicinity of this phage type II locus. For one of 
these systems, we identified the correspond- 
ing host bacterium in the same sequencing sam- 
ple, which encoded its own type II-C CRISPR-Cas 
system with a catalytically active Cas9 (fig. S19B). 
Among the spacers in the host CRISPR array, 
there were four matches to the corresponding 
phage system (fig. S19, C and D). The phage- 
encoded tracrRNA contained a perfect antirepeat 
to the host DRs, such that these two RNAs are 
predicted to form a more stable complex com- 
pared with the host tracrRNA:crRNA complex 
(fig. S19E). Along with the structural similarity 
of the two Cas9s (fig. S19, F and G), these observations 
suggest that the phage Cas9 could derail 
the host CRISPR system by forming stable com- 
plexes with the crRNAs, which is a distinct mech- 
anism that further adds to the notable diversity 
of anti-CRISPR strategies used by viruses (62, 63). 


Diverse auxiliary and adaptation-linked 
CRISPR genes 


Apart from variations on the effector modules, 
a third emerging theme is linkage between genes 
not previously known to associate with CRISPR 
and CRISPR adaptation modules. For example, 
we found Cas adaptation modules linked with 
RNaseH (UAS-3, UAS-45) and DNA polymerases 
(UAS-4, UAS-15) as well as a variety of unexpected 
genes, such as transmembrane domain pro- 
teins (Fig. 6B and fig. S14, U to AS). In addition, 
we identified numerous CRISPR-associated 
Rossmann fold (CARF) domain-containing puta- 
tive effectors in the vicinity of type III CRISPR 
loci, including two-component RNAPol + CARF 
(UAS-58), pppGpp hydrolase + RelA systems 
(UAS-50), and ternary complex VWA-MoxR- 
VMAP coupled domains (UAS-55, UAS-64, 
and UAS-66), suggesting diverse mechanisms 
of CRISPR-activated signaling cascades poten- 
tially linked to other cell stress pathways (Fig. 
6C) (64). We found that diverse vWA-related 
systems associate more broadly with CRISPR loci 
alongside kinase, phosphatase, transmembrane, 
and tubulin domain proteins (UAS-7, UAS-87, 
UAS-91, UAS-100, UAS-129, UAS-139, UAS-149, 
and UAS-155). Additionally, a variety of putative 
regulatory, signaling, and nucleic acid-binding 
proteins were found to be associated with both 
class 1 and class 2 systems as well as numerous 
toxin-antitoxin modules that could safeguard 
cas genes, as previously described for some 
type I systems, or otherwise interact with the 
CRISPR machinery (Fig. 6D) (65, 66). We also 
identified large CRISPR-associated genes encod- 
ing functionally uncharacterized giant multi- 
domain proteins (>3000 aa), one of which, M1, 
contains multiple DNA-interacting domains 
(Fig. 6D). 


Hypervariable, regularly interspersed repeat 
array systems 


Finally, we identified putative undiscovered 
functional systems associated with regularly 
interspaced repeat arrays with hypervariable 
spacers, analogous to CRISPR arrays and ωRNA 
arrays (26) but lacking any cas genes (fig. S14, GJ 
to GO). These systems are distinct from CRISPR 
but might contain previously unknown modular 
functions as previously observed for hyper- 
variable repeat proteins (67, 68). We identified 
six systems containing predicted nucleic acid- 
interacting proteins associated with other, non- 
CRISPR interspaced repeat arrays (fig. S14, GJ 
to GO, and fig. S20A). One of these systems 
included an AddB-like PD(D/E)XK family 
nuclease-helicase with an inactivated helicase 
domain associated with CRISPR-like repeats 
that are preceded by a predicted conserved pro- 
moter, which suggests that the array is ex- 
pressed. We performed small RNA-seq on 
E. coli harboring plasmids carrying these sys- 
tems and found that they expressed small RNAs 
overlapping the repeats and hypervariable 
spacer regions of the arrays (fig. S20B). 

A second system included a GGDEF domain 
[cyclic di-guanosine monophosphate (GMP) 
synthetase] and a major facilitator superfamily 
(MFS) transporter, with an interspersed repeat 
array encoded between them, along with addi- 
tional phospholipase, LCP phosphotransferase, 
and HTH domain proteins (fig. S20A). We per- 
formed small RNA-seq on native organisms 
harboring GGDEF loci and observed trans- 
cription across the identified repeat arrays, 
with apparent processing of the RNA (fig. S20C). 
By analogy with the Cas10 protein of type III 
CRISPR systems, which contains a divergent 
GGDEF domain that, in response to virus 
infection, produces cyclic oligoadenylate that 
activates downstream effectors, these GGDEF- 
containing systems could also produce a second 
messenger activating an RNA-guided compo- 
nent of the system. Thus, these systems gen- 
erally resemble CRISPR and might represent a 
previously unknown RNA-guided mechanism 
with defense or other functions. 

Systems associated with tRNA arrays with 
variable spacers 

We further identified three systems associated 
with interspaced tRNA arrays separated by 
similarly sized variable sequences that could 
modulate the function of the tRNAs through 
mechanisms such as differential expression or 
processing of individual tRNA units (fig. S14, 
GG to GI, and fig. S21). This is consistent with 
the association of some of these tRNA arrays 
with nucleic acid-processing enzymes, such as 
RNaseR, RNaseH, and DNA Pol III epsilon-like 
exonuclease. Overall, these systems might repre- 
sent diverse functions beyond CRISPR that use 
repeat arrays with hypervariable spacers to carry 
out defense and/or regulatory functions. 


Discussion 


The continuing and accelerating proliferation 
of public sequence data has the potential to 
transform biology, but realizing this potential 
requires computational approaches that can 
keep pace with database growth. Central to 
this effort is moving away from all-to-all com- 
parisons. In this work, we used LSH to develop 
FLSHclust—an algorithm for clustering proteins 
by sequence similarity that, unlike the currently 
available methods, can quickly and efficiently 
cluster millions of sequences and will be ap- 
plicable to a broad variety of studies that in- 
volve mining large databases. We applied 
FLSHclust to identify numerous previously 
unreported CRISPR systems and associated 
genes. The systems we identified are rare, with 
many encompassing only a single cluster out 
of the ~130,000 CRISPR-linked clusters that 
we identified, which indicates that the high- 
throughput approach we applied is indispensable 
for the discovery of previously unknown CRISPR 
variants as well as rare variants of other func- 
tional systems. To identify CRISPR-linked genes, 
we used the association score, which we refined 
during this work, with a conservative cutoff. Any 
such cutoff may lead to false negatives, but given 
the vast amount of data analyzed, we focused on 
the most reliable predictions. The discovery of 
previously unknown cas genes and CRISPR sys- 
tems substantially expands the known CRISPR 
diversity, emphasizing the functional versatility 
of CRISPR whereby previously undiscovered pro- 
teins and domains are often recruited, either 
replacing preexisting components or conferring 
newly identified functions to the preexisting 
scaffold of Cas proteins (Fig. 6E). 

We observed many previously unknown do- 
mains and proteins associated with CRISPR 
effector modules, several of which appear to 
compensate for the functions of lost components 
(Fig. 6A), which highlights the modular evolu- 
tion of CRISPR effectors. We identified HNH 
nuclease domains as additions to preexisting 
CRISPR systems on three independent occasions: 
DinG-HNH, Cas5-HNH, and Cas8-HNH (Figs. 3 
and 4). The evolution of these systems mimics 
the origin of type II CRISPR systems, in which an 
HNH nuclease was inserted into the RuvC-like 
nuclease domain of the IsrB protein to become 
IscB, the likely direct ancestor of Cas9 (Fig. 6E) 
(26). Another notable case is the candidate type 
VII CRISPR system that we discovered, in which 
the enzymatic domains of Cas10 were functionally 
replaced by the unrelated β-CASP nuclease 
(Fig. 5). Although the β-CASP-containing CRISPR 
systems appear to be distantly related to and 
most likely derived from type III CRISPR sys- 
tems (fig. S6C), which also appears to be the 
case for type IV systems (29), the limited se- 
quence similarity among the shared components 
(fig. S6, H and I) and the recruitment of a distinct 
interference effector suggest classification of 
these systems as type VII. Similarly, the dis- 
covery of a broad variety of proteins and domains 
associated with CRISPR adaptation modules 
(Fig. 6B) suggests the existence of many func- 
tional and mechanistic variations in this first 
stage of the CRISPR function. CRISPR systems 
can also be co-opted for other RNA-guided 
functions, such as transposition (69), and this 
work extends this form of exaptation beyond 
Tn7-like transposons through the discovery of 
CasMu-I and CasMu-V. 

Taken together, the results of this work reveal 
unprecedented organizational and functional 
flexibility and modularity of CRISPR systems 
but also demonstrate that most variants are rare 
and are only found in relatively unusual bacteria 
and archaea. Apparently, during the billions of 
years of the evolution of prokaryotes, a limited 
number of fittest variants spread broadly by 
horizontal transfer, preventing extensive dis- 
semination of most of the emerging vari- 
ants. The causes of the higher fitness of those 
(relatively) few successful variants are a major 
question for future studies. 

Because of the ability of CRISPR-Cas systems 
to programmably sense specific nucleic acids 
and subsequently enact enzymatic functions, 
the discovery and characterization of previously 
unknown CRISPR effectors and downstream 
auxiliary functions has the potential to enable 
a wide range of applications and to improve 
existing CRISPR-based technologies. In this 
work, we characterized the genome-editing 
activities of Cas8-HNH and Cas5-HNH nucleases, 
which showed notable precision and hold pro- 
mise for further development as genome-editing 
tools. The Cas5-HNH system may also have ap- 
plications in diagnostics given its collateral 
cleavage activity. Beyond genome editing, CRISPR 
adaptation machinery has emerged as a power- 
ful tool for molecular recording, highlighting the 
importance of identifying previously uncharac- 
terized biochemical functions associated with 
the adaptation genes to expand the function 
and scope of such technologies (70, 71). CRISPR- 
associated CARF/SAVED domain effectors could 
be developed as sensitive molecular sense- 
and-respond tools because they enact diverse 
enzymatic functions that are allosterically activated 
by cyclic oligonucleotide binding by the 
CARF/SAVED domain, which is in turn a re- 
sponse to targeted RNA recognition (72-75). 
Notably, we report the first identification of 
multicomponent CARF/SAVED systems, which 
suggests that these systems engage in natural, 
multiprotein signaling cascades that could be 
further adapted for biotechnology. This repre- 
sents only a small fraction of the discovered 
systems, but it illuminates the vastness and 
untapped potential of Earth’s biodiversity, 
and the remaining candidates will serve as a 
resource for future exploration. 


Materials and methods summary 


A complete materials and methods section is 
provided in the supplementary materials. 


FLSHclust implementation 


The FLSHclust algorithm was implemented in 
Python 3 using PySpark for distributed com- 
putation on clusters without shared memory 
or disk. The algorithm is visually depicted in 
fig. S1. Complete details and benchmarking com- 
parisons are described in the materials and 
methods. 
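
To illustrate the general idea of locality-sensitive hashing for sequences, the sketch below buckets proteins by banded MinHash signatures over k-mers, so that only sequences sharing a bucket need to be compared directly. It is a generic, single-machine illustration and is not the FLSHclust implementation; the function names, the k-mer length, and the band and row counts are all assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def kmers(seq: str, k: int = 3) -> set[str]:
    """All overlapping k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash(seq: str, num_hashes: int, k: int = 3) -> tuple[bytes, ...]:
    """MinHash signature: for each seeded hash, keep the smallest k-mer hash."""
    words = kmers(seq, k)
    if not words:
        raise ValueError("sequence is shorter than k")
    return tuple(
        min(
            hashlib.blake2b(f"{seed}:{w}".encode(), digest_size=8).digest()
            for w in words
        )
        for seed in range(num_hashes)
    )


def candidate_pairs(
    seqs: dict[str, str], bands: int = 4, rows: int = 2, k: int = 3
) -> set[tuple[str, str]]:
    """Pairs of sequence names that share at least one LSH band bucket."""
    buckets: dict[tuple, set[str]] = defaultdict(set)
    for name, seq in seqs.items():
        sig = minhash(seq, bands * rows, k)
        for b in range(bands):
            # Sequences collide in band b only if all `rows` values agree.
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(name)
    pairs: set[tuple[str, str]] = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


# Toy input: "a" and "b" are identical, "c" shares no 3-mers with them.
toy = {
    "a": "MKTAYIAKQRQISFVK",
    "b": "MKTAYIAKQRQISFVK",
    "c": "GGGGSSSSPPPPWWWW",
}
print(candidate_pairs(toy))
```

Because each sequence is hashed independently, the bucketing step avoids all-to-all comparison and maps naturally onto distributed frameworks; only the candidate pairs within a bucket would then be aligned and clustered.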


Sensitive CRISPR discovery pipeline 


For CRISPR prediction, four CRISPR finders 
[PILERCR (22), CRT (23), CRISPRFinder (24), 
and CRONUS] were used with a total of six runs 
based on parameter combinations selected from 
a calibration against the synthetic CRISPR array 
benchmark. CRISPR array predictions from the 
various CRISPR finders were deduplicated by 
grouping in intervals, and the best CRISPR 
from each interval was selected. Operons were 
then defined from predicted proteins in each 
contig, and operonic distance from each operon 
to CRISPR arrays was calculated. We used a 
maximum distance threshold of 3000 bp to 
select protein operons associated with CRISPR 
arrays. Proteins were then redundancy reduced, 
and we then calculated a weighted naive score 
for each resulting 30% cluster. Divergent DRs 
were identified by searching for consensus DRs 
(identified from each cluster) within a 10-kbp 
window of each protein in the 30% cluster. The 
enhanced score was calculated in the same 
manner as the naive score, now using the 
searched DRs. 


E. coli PAM discovery assay 


Plasmids expressing the proteins and corres- 
ponding crRNA from the system of interest and 
containing a target 8N degenerate flanking 
library plasmid were transformed by electroporation 
into Endura Electrocompetent E. coli 
(Lucigen). After 12 to 16 hours, cells were 
scraped from transformant plates and mini- 
prepped to recover the resulting libraries, 
which were prepared and sequenced on an 
Illumina NextSeq. PAMs were extracted, and 
Weblogos depicting PAMs depleted 5 standard 
deviations relative to the empty control were 
visualized using Weblogo3. 
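
One way to call depleted PAMs from such sequencing counts is sketched below. The scoring scheme is an assumption made for illustration (log2 ratio of normalized frequencies, converted to a z-score across all PAMs, with a pseudocount); the exact statistic behind the 5-standard-deviation cutoff is described in the supplementary methods and may differ. The function name `depleted_pams` is hypothetical.

```python
import math
import statistics
from itertools import product


def depleted_pams(
    sample: dict[str, int],
    control: dict[str, int],
    n_sd: float = 5.0,
    pseudocount: float = 1.0,
) -> list[str]:
    """PAMs whose log2(sample/control) lies at least n_sd SDs below the mean."""
    pams = sorted(set(sample) | set(control))
    s_total = sum(sample.get(p, 0) + pseudocount for p in pams)
    c_total = sum(control.get(p, 0) + pseudocount for p in pams)
    log_ratio = {
        p: math.log2(
            ((sample.get(p, 0) + pseudocount) / s_total)
            / ((control.get(p, 0) + pseudocount) / c_total)
        )
        for p in pams
    }
    values = list(log_ratio.values())
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []  # no variation, so nothing can be called depleted
    return [p for p in pams if (log_ratio[p] - mean) / sd <= -n_sd]


# Toy library of all 3-nt flanks; one PAM is strongly depleted in the sample.
all_pams = ["".join(p) for p in product("ACGT", repeat=3)]
control_counts = {p: 1000 for p in all_pams}
sample_counts = dict(control_counts, AAG=1)
print(depleted_pams(sample_counts, control_counts))
```

The retained PAM sequences would then be passed to a sequence-logo tool to visualize the consensus.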


Expression and purification of 
recombinant proteins 


E. coli codon optimized proteins and associated 
ncRNAs were expressed from IPTG-inducible T7 
promoters and purified with His14 or TwinStrep 
tags as specified using nickel or streptavidin 
affinity resin, respectively, using gravity flow 
columns. In some cases, purified proteins or 
RNPs were dialyzed overnight before use. 


Small RNA-seq 


Total RNA was extracted from native organisms, 
E. coli cultures containing plasmids encoding 
loci of interest, or affinity purified RNP com- 
plexes. The purified RNA was then subject to 
treatment with T4 PNK (NEB) and RNA 5’ 
polyphosphatase (Biosearch Technologies). After 
enzymatic treatments, purified RNA was sub- 
ject to library preparation with an NEBNext 
Multiplex Small RNA Library Prep kit (NEB) 
and sequenced on an Illumina MiSeq or 
NextSeq. 


In vitro cleavage assays 


Nucleic acid substrates were prepared by poly- 
merase chain reaction (PCR) with Cy3/Cy5 con- 
jugated oligos (IDT) as primers (dsDNA), ordered 
directly as Cy5-conjugated oligos (IDT) (ssDNA), 
or in vitro transcribed from PCR templates and 
labeled with pCp-Cy5 (Jena Biosciences) using 
T4 RNA ligase 1, ssRNA ligase (High Concentra- 
tion) (NEB) (RNA). Substrates were mixed with 
protein and buffer components and incubated 
at various temperatures, and results were re- 
solved by gel electrophoresis, as specified in 
the materials and methods. 


Mammalian genome editing 


Genome-editing experiments were performed 
in the HEK293FT cell line (Thermo Fisher 
Scientific). Cells were transfected with Lipo- 
fectamine 3000, and genomic DNA was har- 
vested 60 to 72 hours after transfection using 
QuickExtract DNA Extraction Solution (Lucigen). 
Target genomic regions were amplified by two 
rounds of PCR with NEBNext High Fidelity 2x 
PCR Master Mix (NEB) and sequenced on an 
Illumina MiSeq. Indel frequency was analyzed 
using CRISPResso2. 
An extremely energetic cosmic ray observed by a surface detector array 


Telescope Array Collaboration*† 


Cosmic rays are energetic charged particles from extraterrestrial sources, with the highest-energy 
events thought to come from extragalactic sources. Their arrival is infrequent, so detection requires 
instruments with large collecting areas. In this work, we report the detection of an extremely energetic 
particle recorded by the surface detector array of the Telescope Array experiment. We calculate 
the particle’s energy as 244 ± 29 (stat.) $^{+51}_{-76}$ (syst.) exa-electron volts (~40 joules). Its arrival direction 
points back to a void in the large-scale structure of the Universe. Possible explanations include a large 
deflection by the foreground magnetic field, an unidentified source in the local extragalactic 
neighborhood, or an incomplete knowledge of particle physics. 


Ultrahigh-energy cosmic rays (UHECRs) 
are subatomic particles from extragalactic 
sources with energies greater than 
1 EeV (equal to 10¹⁸ eV), which is about 
a million times as high as the energy 
reached by human-made particle accelerators. 
The origins of UHECRs are thought to be re- 
lated to the most energetic phenomena in the 
Universe, such as relativistic jets and outflows 
associated with black holes, gamma-ray bursts 
and relativistic flares of active galactic nuclei, 
or large-scale accretion shocks around clusters 
of galaxies (1). Alternatively, UHECRs might 
be produced by physics beyond the standard 
model of particle physics (2-4), though this 
possibility is constrained by upper bounds on 
the flux of ultrahigh-energy (UHE) photons 
(5, 6). The acceleration mechanisms of these 
particles are also unknown. Because cosmic rays 
are charged, they are deflected along their path 
to Earth by intervening galactic and extragalac- 
tic magnetic fields, so their arrival directions 
do not necessarily point to their sources. 
Although cosmic rays with energy >100 EeV 
have been observed (7), interactions with the 
cosmic microwave background radiation (CMB) 
(8) are expected to suppress the flux of UHECRs 
above 60 EeV (9, 10). This is because interac- 
tions between UHECR protons and the CMB 
can produce pions or induce photodisintegra- 
tion of heavy nuclei. The resulting break in 
the expected energy spectrum is known as the 
Greisen-Zatsepin-Kuzmin (GZK) cutoff (9, 10). 
This cutoff limits the origins of the highest- 
energy particles detected on Earth to sources 
with maximum distances of 50 to 100 Mpc, 
which have a short-enough path length for 
UHECRs to survive passage through the inter- 
galactic medium. At these distance scales, the 
Universe is not homogeneous: Matter is concentrated 
in a large-scale structure (LSS) com- 
posed of galaxy clusters, superclusters, filaments, 
and sheets, separated by intergalactic voids. A 
suppression of cosmic-ray flux at the highest 
energies consistent with the GZK cutoff has 
been observed (11-13). However, UHECRs with 
energies above the GZK cutoff are expected to 
be deflected less strongly by magnetic fields 
because of their higher kinetic energies, so their 
arrival directions are expected to be more closely 
correlated with their sources. 
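
As a rough cross-check of the energy scale of the GZK suppression, the threshold for photopion production can be estimated from two-body kinematics. The sketch below is illustrative only: it assumes a head-on collision between a proton and a CMB photon carrying the mean energy of a 2.725-K blackbody, and the constants and function names are assumptions that do not come from the article.

```python
# Threshold proton energy for p + gamma -> p + pi0 on a CMB photon (head-on).
K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant [eV/K]
M_PROTON_EV = 938.272e6         # proton rest energy [eV]
M_PION0_EV = 134.977e6          # neutral pion rest energy [eV]
T_CMB_K = 2.725                 # CMB temperature [K] (assumed input)


def mean_cmb_photon_energy_ev(temperature_k=T_CMB_K):
    """Mean photon energy of a blackbody, about 2.701 k_B T."""
    return 2.701 * K_B_EV_PER_K * temperature_k


def photopion_threshold_ev(photon_energy_ev):
    """Proton energy at which s = (m_p + m_pi)^2 for a head-on photon.

    For an ultrarelativistic proton, s = m_p^2 + 4 E eps, so
    E_th = m_pi (2 m_p + m_pi) / (4 eps).
    """
    return M_PION0_EV * (2.0 * M_PROTON_EV + M_PION0_EV) / (4.0 * photon_energy_ev)


if __name__ == "__main__":
    eps = mean_cmb_photon_energy_ev()
    print(f"mean CMB photon energy: {eps:.2e} eV")
    print(f"threshold proton energy: {photopion_threshold_ev(eps) / 1e18:.0f} EeV")
```

With these inputs the threshold comes out near 10²⁰ eV. Photons in the high-energy tail of the blackbody spectrum allow the reaction at lower proton energies, which is the usual qualitative reason the suppression sets in below this simple estimate.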


The Telescope Array experiment 


At energies greater than 100 EeV, the flux of 
cosmic rays is less than one particle per century 
per square kilometer (12). This low flux 
can only be measured by an instrument with 
a collecting area of ~1000 km². The energy, 
mass, and arrival direction of UHECRs can be 
reconstructed from the cascades of second- 
ary particles [an extensive air shower (EAS)] 
produced by their interaction with Earth’s 
atmosphere. Arrays of detectors, such as plas- 
tic scintillators or water-Cerenkov stations, can 
measure EASs when they reach the ground. 
The Telescope Array (TA) experiment is a 
cosmic-ray detector located in Utah, USA, at 
39.30° north, 112.91° west and 1370 m above 
sea level. It consists of a surface detector (SD) 
array with 507 stations arranged in a square 
grid. Each detector has two 3-m² layers of plastic 
scintillator that detect charged EAS particles. 
The stations are spaced by 1.2 km, giving a 
total effective area of 700 km² (14). The time-dependent 
response of the surface detectors 
is continuously monitored and calibrated by 
the detection of penetrating muons and electrons 
(14). The sky over the SD is viewed by 
fluorescence detectors, which directly mea- 
sure photons produced by the propagation of 
an EAS through the atmosphere, providing 
a calorimetric measurement of the shower 
ray is estimated using the fluorescence detectors 
by determining X_max, the atmospheric 
slant depth (in grams per square centimeter) 
at which an EAS deposits most of its energy. 
The X_max observable is related to the mass 
composition by a statistical analysis; the TA 
cannot determine the particle mass on an 
event-by-event basis (16). The SD measurements 
carry indirect information about the primary 
composition, which is also determined on a 
statistical basis using machine learning (17). 

The arrival direction of a cosmic-ray particle 
is determined from the relative arrival times of 
the shower front at multiple SD stations, mea- 
sured by a time-synchronized global position- 
ing system (GPS) module mounted on each 
station. The particle density measured at a 
distance of 800 m from the EAS axis, S800, is 
used as the energy indicator. S800 is converted 
to the primary energy of the cosmic ray as a 
function of zenith angle based on Monte Carlo 
EAS simulations (18). The SD energy scale has 
been calibrated to the calorimetric energy mea- 
sured by the fluorescence detectors with a 
factor of 1/1.27 (12). The resolution of the SD is 
1.5° in arrival direction and 15% in energy (12), 
with a systematic uncertainty of 21% (19). The 
detailed analysis procedure and event recon- 
struction are described in the supplementary 
materials (20). 
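
The timing-based direction reconstruction described above can be illustrated with a toy plane-front fit. This is a minimal sketch, not the collaboration's reconstruction: it assumes a flat array, a planar shower front moving at the speed of light, and noise-free times, whereas real shower fronts are curved and the actual procedure is described in the supplementary materials. The function names and the 3 × 3 test grid are hypothetical.

```python
import math

C_KM_PER_US = 0.299792458  # speed of light [km per microsecond]


def simulate_hits(positions_km, zenith_deg, azimuth_deg):
    """Arrival times [us] of a plane front at (x, y) stations on flat ground."""
    th, ph = math.radians(zenith_deg), math.radians(azimuth_deg)
    u, v = math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph)
    # Stations lying further toward the source direction are hit earlier.
    return [(x, y, -(u * x + v * y) / C_KM_PER_US) for x, y in positions_km]


def fit_plane_front(hits):
    """Least-squares plane fit to (x_km, y_km, t_us) hits.

    Returns (zenith_deg, azimuth_deg); azimuth is anticlockwise from +x (east).
    """
    n = len(hits)
    mx = sum(h[0] for h in hits) / n
    my = sum(h[1] for h in hits) / n
    mt = sum(h[2] for h in hits) / n
    sxx = sum((h[0] - mx) ** 2 for h in hits)
    syy = sum((h[1] - my) ** 2 for h in hits)
    sxy = sum((h[0] - mx) * (h[1] - my) for h in hits)
    sxt = sum((h[0] - mx) * (h[2] - mt) for h in hits)
    syt = sum((h[1] - my) * (h[2] - mt) for h in hits)
    det = sxx * syy - sxy ** 2
    slope_x = (sxt * syy - syt * sxy) / det   # dt/dx [us/km]
    slope_y = (syt * sxx - sxt * sxy) / det   # dt/dy [us/km]
    u, v = -C_KM_PER_US * slope_x, -C_KM_PER_US * slope_y
    zenith = math.degrees(math.asin(min(1.0, math.hypot(u, v))))
    azimuth = math.degrees(math.atan2(v, u)) % 360.0
    return zenith, azimuth


if __name__ == "__main__":
    grid = [(1.2 * i, 1.2 * j) for i in range(3) for j in range(3)]  # 1.2-km spacing
    print(fit_plane_front(simulate_hits(grid, 38.6, 206.8)))
```

The time gradient across the array fixes the horizontal components of the arrival direction, so at least three non-collinear stations are needed for the fit to be determined.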


Energetic particle on 27 May 2021 


An unusually high-energy event was identified 
during an arrival direction analysis (21) 
of all SD data taken between May 2008 and 
November 2021. This event triggered 23 detectors 
at the northwest region of the TA SD. 
The lateral density distribution (fig. S1) was 
used to determine S800 and hence the energy 
of the primary cosmic ray. Following our 
analysis procedure (20), we determined that 
the event on 27 May 2021 had a reconstructed 
energy of 244 ± 29 (stat.) $^{+51}_{-76}$ (syst.) EeV in the 
detector frame. This energy is ~4 × 10⁷ times 
as high as the ~7-TeV protons produced by the 
Large Hadron Collider (LHC) (22). When this 
cosmic-ray particle experienced its first collision 
with a nucleon at rest in the upper atmosphere, 
the corresponding center-of-mass energy of the 
particle collision, assuming the particle was a 
proton, was ~700 TeV. 
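
The figures quoted in this paragraph can be reproduced with a few lines of arithmetic. The sketch below assumes a proton primary striking a proton at rest (fixed-target kinematics); the function names are illustrative and not taken from the article.

```python
import math

EV_TO_JOULE = 1.602176634e-19   # exact SI conversion factor
M_PROTON_EV = 938.272e6         # proton rest energy [eV]


def eev_to_joule(energy_eev):
    """Convert an energy in EeV (1e18 eV) to joules."""
    return energy_eev * 1e18 * EV_TO_JOULE


def fixed_target_sqrt_s_ev(e_lab_ev, m_beam_ev=M_PROTON_EV, m_target_ev=M_PROTON_EV):
    """Centre-of-mass energy for a beam of total energy e_lab_ev on a target at rest."""
    s = m_beam_ev ** 2 + m_target_ev ** 2 + 2.0 * e_lab_ev * m_target_ev
    return math.sqrt(s)


if __name__ == "__main__":
    e_ev = 244e18
    print(f"energy: {eev_to_joule(244):.1f} J")
    print(f"ratio to a 7-TeV LHC proton: {e_ev / 7e12:.2e}")
    print(f"sqrt(s): {fixed_target_sqrt_s_ev(e_ev) / 1e12:.0f} TeV")
```

The centre-of-mass energy grows only as the square root of the laboratory energy, which is why a particle tens of millions of times more energetic than an LHC proton yields a collision energy only about 50 times that of the LHC's proton-proton collisions.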

Figure 1A shows a map of the TA SD sig- 
nals that it recorded for the high-energy cos- 
mic ray, including the footprint of the EAS on 
the TA SD. Figure 1B shows the recorded sig- 
nal size measured at each surface detector. 
Table 1 summarizes the reconstructed prop- 
erties of the event. The waveforms recorded 
by detectors at distances above 2 km contain 
many peaks from muons induced by the ha- 
dronic interactions. With so many muon com- 
ponents, the primary particle is unlikely to 
be a photon because EASs induced by photons 
primarily consist of electromagnetic particles. We applied a neural network proton-photon
classifier, developed for photoinduced shower searches using the TA SD (23, 24), to this event.
The classifier excludes a photon as the primary particle at the 99.986% confidence level,
instead favoring a proton as the primary particle. However, the classifier is unable to
distinguish between protons and heavier nuclei for this event because the fluorescence
detectors were not operating at the time (owing to bright moonlight).


Fig. 1 (continued). Labels indicate the detector number, total signal in units of the minimum-ionizing
particle (MIP), and the distance from the shower axis. Thick and thin lines (mostly overlapping) are the
recorded signals in the upper and lower layers of each station. Each SD is identified by a four-digit number:
The first two digits correspond to the column of the array in which the SD is located (numbered west to east),
and the second two digits correspond to the row (numbered south to north). Colors correspond to those in (A).
Axis label: relative time from earliest detector [μs]. UTC, coordinated universal time.


Table 1. Reconstructed properties of the high-energy event. The reconstructed energy and S800 are given for
the high-energy particle. The arrival direction is given in both the observed zenith-azimuth coordinates and
the derived equatorial coordinates. The azimuth angle is defined to be anticlockwise from the east. The event
time is expressed in UTC.

Time (UTC)       27 May 2021 10:35:56
Energy (EeV)     244 ± 29 (stat.) $^{+51}_{-76}$ (syst.)
S800 (m⁻²)       530 ± 57
Zenith angle     38.6 ± 0.4°
Azimuth angle    206.8 ± 0.6°
R.A.             259.9 ± 0.6°
Dec.             16.1 ± 0.5°


The core position of this event was located 1.1 km from the northwest edge of the SD (Fig. 1A).
We evaluate the statistical uncertainty of the reconstructed energy using a detector
simulation (12) and assuming the reconstructed geometry and energy parameters; we find an
energy resolution of 29 EeV for this event.
Assuming an energy spectrum of E~** above 
100 EeV, as previously measured using the TA 
SD (12), the migration effect (whereby lower-energy showers are reconstructed with higher
energies because of the energy resolution) is evaluated as −3%. We include an additional
systematic uncertainty, owing to the unknown primary, of −10% in the direction of lower
energies, calculated from simulations (20). There was no lightning or thunderstorm activity
recorded in the vicinity of the TA site on 27 May 2021 (25).


Fig. 2. Arrival direction of the high-energy event compared with potential sources. The arrival direction
of the 27 May 2021 high-energy cosmic-ray particle (black circle) on a sky map in equatorial coordinates.
Colored circles indicate calculated backtracked directions assuming two models of the Milky Way regular
magnetic field, labeled JF2012 (31) and PT2011 (32). For each model, different symbols indicate the
directions calculated for four possible primary species: proton (P; red), carbon (C; purple), silicon
(Si; green), and iron (Fe; blue). The color bar indicates the relative flux expected
from the inhomogeneous source-density distribution in the local LSS, smeared with a
random Milky Way magnetic field. For comparison, nearby gamma ray-emitting
active galactic nuclei are shown with filled diamonds and nearby starburst galaxies
with filled stars, both with sizes that scale by the expected flux (38). The closest object
to the proton backtracked direction in a gamma-ray source catalog (34) is the active


Comparison with previous events 


Previously reported extremely high-energy cosmic- 
ray events include a 320-EeV particle in 1991 (26), 
a 213-EeV particle in 1993 (27), and a 280-EeV 
particle in 2001 (28). The 1991 event was mea- 
sured using fluorescence detectors, whereas 
the 1993 and 2001 events were both detected 
using surface detector arrays. All of these events 
were recorded by detectors in the Northern 
Hemisphere. A search in the Southern Hemi- 
sphere has not identified any events with en- 
ergy greater than 166 EeV (29), although there 
is an energy scale difference between the ex- 
periments (30). Although the event that we 
have detected was measured with a surface 
detector array, the reported energy of 244 EeV 
has been normalized to the equivalent energy 
that would have been measured with the TA 
fluorescence detector and is thus directly com- 
parable to the 1991 event. This normalization 
was performed because fluorescence detectors 
provide a direct, calorimetric measurement of 
the shower energy. The unnormalized TA SD 
reconstructed energy of 309 ± 37 (stat.) EeV 
(20) is more appropriate for comparison with 
the 1993 and 2001 events. 
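
The relation between the two quoted energies, and the size of the systematic bounds, can be checked numerically. The sketch below uses only numbers stated in the article (the 1/1.27 scale factor, the 21% systematic uncertainty, and the additional −10% term for the unknown primary). Adding the two downward terms linearly is an assumption made here for illustration; the article does not state how they were combined.

```python
FD_OVER_SD = 1.0 / 1.27  # SD energy scale calibrated to the fluorescence detectors


def sd_to_fd_energy(e_sd_eev):
    """Rescale an unnormalized SD energy to the fluorescence-detector scale."""
    return e_sd_eev * FD_OVER_SD


def systematic_bounds(energy_eev, symmetric_frac=0.21, extra_down_frac=0.10):
    """Absolute (up, down) systematic bounds.

    Assumption: the extra downward term adds linearly to the symmetric one.
    """
    up = energy_eev * symmetric_frac
    down = energy_eev * (symmetric_frac + extra_down_frac)
    return up, down


if __name__ == "__main__":
    print(f"normalized energy: {sd_to_fd_energy(309):.0f} EeV")
    print(f"normalized stat. uncertainty: {sd_to_fd_energy(37):.0f} EeV")
    up, down = systematic_bounds(244)
    print(f"systematic bounds: +{up:.0f} / -{down:.0f} EeV")
```

Under that assumption the 309 ± 37 EeV SD value maps onto roughly 243 ± 29 EeV, and the systematic bounds come out at about +51 and −76 EeV.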


Possible sources of the cosmic ray 


Figure 2 shows the calculated arrival direc- 
tion of the 27 May 2021 event on a sky map in 
equatorial coordinates. The arrival direction is 
not far from the disk of the Milky Way, where 
the galactic magnetic field (GMF) is strong 
enough to substantially deflect even a particle 
with an energy of 244 EeV, especially if the 
primary particle is a heavy nucleus with a 
large electric charge. The map also shows eight 
possible backtracked arrival directions, which 
we calculated (20) by assuming two GMF mod- 
els (31, 32) and four possible primary particles 
(proton, carbon nucleus, silicon nucleus, or iron 
nucleus). We used the backtracking method of 
a cosmic-ray propagation framework (33) to 
determine the arrival direction for the cosmic 
ray before it entered the Milky Way. 

We compared the arrival directions with a 
catalog of gamma-ray sources (34). We found 
that the active galaxy PKS 1717+177 is located 
within 2.5° of the calculated direction for a pro- 
ton primary. PKS 1717+177 is a flaring source 
(34); flaring sources have been proposed as 
potential cosmic-ray sources (35). However, 
its distance of ~600 Mpc (corresponding to a 
redshift of 0.137) (36) is expected to be too large 
for UHECR propagation to Earth because the 
average propagation distance at an energy of 
244 EeV is calculated to be ~30 Mpc for both proton 
and iron primaries (20). We therefore disfavor 
PKS 1717+177 as the source of this event. 
Fig. 2 (continued). ... galaxy PKS 1717+177. The dotted large circle centered around (R.A., Dec.) = (146.7°,
43.2°) indicates the previously reported TA hot spot (21). The dashed horizontal line 
indicates the limit of the TA field of view (FoV). The dotted circle centered around 
(R.A., Dec.) = (279.5°, 18.0°) is the location of the Local Void (40). The galactic plane 
(G.P.) and the supergalactic plane (S.G.P.) are shown as solid and dotted curves, 
respectively. The Galactic Center (G.C.) is indicated by the cross symbol. deg., degrees. 


Figure 2 also shows the relative expected 
flux from an inhomogeneous source-density dis- 
tribution following the local LSS (37), weighted 
by the expected attenuation for a 244-EeV iron 
primary and smoothed to reflect the smearing 
resulting from turbulent magnetic fields in the 
Milky Way (20). Also shown are nearby gam- 
ma ray-emitting active galactic nuclei and star- 
burst galaxies, which have been proposed as 
possible cosmic-ray sources (38, 39). The ar- 
rival direction of this event is consistent with 
the location of the Local Void, a cavity between 
the Local Group of galaxies and nearby LSS fil- 
aments (40). There are only a small number of 
known galaxies in the void, none of which are 
expected sites of UHECR acceleration. Even 
considering the range of possible GMF deflec- 
tions and primary mass, we do not identify any 
candidate sources for this event. Only in the 
JF2012 GMF model and assuming an iron 
primary does the source direction approach a 
part of the LSS populated by galaxies. This 
backtracked direction is close to the starburst 
galaxy NGC 6946, also known as the Fireworks 
Galaxy, at a distance of 7.7 Mpc (41). However, 
NGC 6946 is not detected in gamma rays, so it 
is unlikely to be a strong source of UHECRs. 

If the energy of this event was close to the 
lower bound of its uncertainties, then the av- 
erage propagation distance is longer than we 
assumed in Fig. 2, and the deflection in the 
GMF would be larger (fig. S3). This effect would 
increase the number of possible source gal- 
axies, assuming a steady source (supplemen- 
tary text). For the alternative case of transient 
sources, identifying a source is complicated by the time delays between electromagnetic
radiation and charged particles because of the additional path lengths induced by magnetic
deflection. We therefore cannot identify any potentially related transient sources.


Fig. 3. Arrival directions of all >100-EeV cosmic rays. Empty circles indicate the arrival directions of
all cosmic rays observed by TA SD over 13.5 years of operation that had energies >100 EeV. The background
and other symbols are the same as in Fig. 2. No clustering around the highest-energy event (thick
circle) is evident.

Nevertheless, the detection of this highly 
energetic particle allows us to estimate D0, the 
distance to the closest UHECR source (supple- 
mentary text). Assuming that the particle is an 
iron nucleus injected with an initial energy 
of Ep = 10° EeV, taking into account the en- 
ergy loss length estimated by the same prop- 
agation framework used in the backtracking 
method (42), we find D0 = 10.333 Mpc. Alternatively, 
assuming a proton primary, we 
find D0 = 27.0135 Mpc. At these energies, 
the UHECR background of distant sources is 
attenuated by the energy loss length, so only 
sources from the local Universe can contrib- 
ute. We set upper limits on the deflection by 
assuming a maximum value of the turbulent 
extragalactic magnetic field B,.,, ~ 1nG and 
a 1-Mpc characteristic length scale, finding 
<20° for iron and <1° for proton. 
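
The scale of these deflection limits can be illustrated with a standard small-angle random-walk estimate, in which a particle crossing many magnetic cells of coherence length λ over a distance D is deflected by roughly sqrt(2Dλ/9) divided by its Larmor radius. This is a sketch under stated assumptions and not the calculation used in the article (see its supplementary text); the round distances of 10 and 27 Mpc are illustrative inputs chosen here.

```python
import math

LARMOR_KPC_PER_EEV_UG = 1.081  # Larmor radius of a 1-EeV proton in a 1-microgauss field [kpc]


def larmor_radius_mpc(energy_eev, charge_z, b_ng):
    """Larmor radius in Mpc for a nucleus of charge Z in a field of b_ng nanogauss."""
    b_ug = b_ng * 1e-3
    return LARMOR_KPC_PER_EEV_UG * 1e-3 * energy_eev / (charge_z * b_ug)


def rms_deflection_deg(energy_eev, charge_z, distance_mpc, b_ng=1.0, coherence_mpc=1.0):
    """Random-walk rms deflection, valid for small angles and distance >> coherence length."""
    r_l = larmor_radius_mpc(energy_eev, charge_z, b_ng)
    theta_rad = math.sqrt(2.0 * distance_mpc * coherence_mpc / 9.0) / r_l
    return math.degrees(theta_rad)


if __name__ == "__main__":
    print(f"proton, 27 Mpc: {rms_deflection_deg(244, 1, 27.0):.2f} deg")
    print(f"iron,   10 Mpc: {rms_deflection_deg(244, 26, 10.0):.1f} deg")
```

For a 1-nG field and 1-Mpc coherence length, the estimate gives well under 1° for a proton and several degrees for iron, the same ordering as the limits quoted above. The deflection scales linearly with charge, which is why the unknown composition dominates the directional uncertainty.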


Distribution of other TA events 


Figure 3 shows the arrival directions for the 
28 TA SD events with energies >100 EeV ob- 
served between May 2008 and November 2021 
using the same event selection (21). The total 
exposure is 1.6 × 10⁴ km² sr year. No clustering 
with the highest-energy event is found. The 
244-EeV event came from a different direction 
than the TA hot spot, a 3.4σ excess centered at 
right ascension (R.A.) 146.7°, declination (Dec.) 
43.2°, that was previously identified for events 
with energies >57 EeV (21). 
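
The event count and exposure given here imply an integral flux that can be compared with the "less than one particle per century per square kilometer" figure quoted earlier. The sketch below reads the exposure as 1.6 × 10⁴ km² sr yr and assumes an isotropic flux on a flat horizontal surface, for which the projected solid angle of the upper hemisphere is π sr; both function names are illustrative.

```python
import math


def integral_flux(n_events, exposure_km2_sr_yr):
    """Integral flux in events per km^2 per sr per year."""
    return n_events / exposure_km2_sr_yr


def rate_per_km2_per_century(flux_per_km2_sr_yr):
    """Rate through a flat horizontal surface for an isotropic flux.

    The integral of cos(theta) over the upper hemisphere equals pi sr.
    """
    return flux_per_km2_sr_yr * math.pi * 100.0


if __name__ == "__main__":
    flux = integral_flux(28, 1.6e4)
    print(f"flux above 100 EeV: {flux:.2e} per km^2 sr yr")
    print(f"rate: {rate_per_km2_per_century(flux):.2f} per km^2 per century")
```

The result is about 0.55 particles per square kilometer per century, consistent with the order of magnitude stated in the description of the experiment.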

Although we expected events with energies 
above 100 EeV to be clustered, the observed 
arrival directions above 100 EeV have an iso- 
tropic distribution (Fig. 3). The lack of a nearby 
source for the 244-EeV event could be due 
to larger magnetic deflections than predicted 
by the GMF models, caused by a heavy pri- 
mary particle or stronger magnetic fields than 
in the models. Alternatively, super-GZK UHECRs 
could indicate an incomplete understanding 
of particle physics. If there are unknown types 
of primary particles that are immune to the 
interactions with the CMB, they could retain 
their energy while traveling to Earth from more- 
distant active galaxies. We cannot distinguish 
between these possibilities with the observed 
events. 


Summary and conclusions 


We detected a particle with an energy of 244 ± 
29 (stat.) $^{+51}_{-76}$ (syst.) EeV on 27 May 2021. The 
arrival direction of this event does not align 
with any known astronomical objects thought 
to be potential sources of UHECRs, even after 
accounting for deflection by the GMF under 
various assumptions. Comparison with other 
observed events at energies above 100 EeV 
shows an isotropic distribution with no ap- 
parent clustering. 


REFERENCES AND NOTES 


1. A.M. Hillas, Annu. Rev. Astron. Astrophys. 22, 425-444 
(1984). 

2. P. Bhattacharjee, G. Sigl, Phys. Rep. 327, 109-247 (2000). 

3. F. W. Stecker, S. T. Scully, New J. Phys. 11, 085003 
(2009). 

4. V. Berezinsky, M. Kachelriess, A. Vilenkin, Phys. Rev. Lett. 79, 
4302-4305 (1997). 

5. R. U. Abbasi et al., Astropart. Phys. 110, 8-14 (2019). 

6. P. Abreu et al., Astrophys. J. 933, 125 (2022). 

7. J. Linsley, Phys. Rev. Lett. 10, 146-148 (1963). 

8. A.A. Penzias, R. W. Wilson, Astrophys. J. 142, 419-421 
(1965). 

9. K. Greisen, Phys. Rev. Lett. 16, 748-750 (1966). 

10. G. T. Zatsepin, V. A. Kuzmin, JETP Lett. 4, 78 (1966). 

11. R. U. Abbasi et al., Phys. Rev. Lett. 100, 101101 (2008). 

12. T. Abu-Zayyad et al., Astrophys. J. Lett. 768, L1 (2013). 

13. A. Aab et al., Phys. Rev. Lett. 125, 121106 (2020). 

14. T. Abu-Zayyad et al., Nucl. Instrum. Methods Phys. Res. A 689, 
87-97 (2013). 




15. H. Tokuno et al., Nucl. Instrum. Methods Phys. Res. A 676, 
54-65 (2012). 

16. R. U. Abbasi et al., Astrophys. J. 858, 76 (2018). 

17. R. U. Abbasi et al., Phys. Rev. D 99, 022002 (2019). 

18. D. Heck, J. Knapp, J. N. Capdevielle, G. Schatz, T. Thouw, 
“CORSIKA: A Monte Carlo code to simulate extensive air 
showers,” Forschungszentrum Karlsruhe Report, FZKA-6019 
(1998). 

19. R. U. Abbasi et al., Astropart. Phys. 80, 131-140 (2016). 

20. Materials and methods are available as supplementary 
materials. 

21. R. Abbasi et al., Astrophys. J. Lett. 790, L21 (2014). 

22. L. Evans, P. Bryant, J. Instrum. 3, S08001 (2008). 

23. O. Kalashev, I. V. Kharuk, M. Y. Kuznetsov, G. I. Rubtsov; 
Telescope Array Collaboration, in Proceedings of 37th 
International Cosmic Ray Conference, PoS(ICRC2021) (2021), 
p. 864. 

24. I. Kharuk, O. Kalashev; Telescope Array Collaboration, in 
Proceedings of 37th International Cosmic Ray Conference, 
PoS(ICRC2021) (2021), p. 384. 

25. Lightning Exporter (Vaisala, 2023); https://lightning-exporter.vaisala.com. 

26. D. J. Bird et al., Astrophys. J. 441, 144 (1995). 

27. N. Hayashida et al., Phys. Rev. Lett. 73, 3491-3494 
(1994). 

28. N. Sakaki et al., in Proceedings of 27th International Cosmic 
Ray Conference (2001), pp. 333-336. 

29. P. Abreu et al., Astrophys. J. 935, 170 (2022). 

30. V. Verzi, D. Ivanov, Y. Tsunesada, Prog. Theor. Exp. Phys. 2017, 
2A103 (2017). 

31. R. Jansson, G. R. Farrar, Astrophys. J. 757, 14 (2012). 

32. M. S. Pshirkov, P. G. Tinyakov, P. P. Kronberg, 
K. J. Newton-McGee, Astrophys. J. 738, 192 (2011). 

33. R. Alves Batista et al., J. Cosmol. Astropart. Phys. 2016, 038 
(2016). 

34. S. Abdollahi et al., Astrophys. J. Suppl. Ser. 247, 33 (2020). 

35. G. R. Farrar, A. Gruzinov, Astrophys. J. 693, 329-332 (2009). 

36. D. Sowards-Emmerd, R. W. Romani, P. F. Michelson, 
S. E. Healey, P. L. Nolan, Astrophys. J. 626, 95-103 (2005). 

37. M. Y. Kuznetsov, P. G. Tinyakov, J. Cosmol. Astropart. Phys. 
2021, 065 (2021). 

38. A. Aab et al., Astrophys. J. Lett. 853, L29 (2018). 

39. R. U. Abbasi et al., Astrophys. J. Lett. 867, L27 (2018). 

40. R. B. Tully et al., Astrophys. J. 676, 184-205 (2008). 

41. J. J. Eldridge, L. Xiao, Mon. Not. R. Astron. Soc. Lett. 485, 
L58-L61 (2019). 

42. O. E. Kalashev, E. Kido, J. Exp. Theor. Phys. 120, 790-797 
(2015). 

43. T. Fujii, Data of an extremely energetic cosmic ray observed by 
a surface detector array of Telescope Array experiment, 
Zenodo (2023); https://doi.org/10.5281/zenodo.8427755. 


ACKNOWLEDGMENTS 
The Telescope Array experimental site became available 
through the cooperation of the Utah School and Institutional 


4 of 5 


RESEARCH | RESEARCH ARTICLE 


rust Lands Administration (SITLA), the US Bureau of Land 
Management (BLM), and the US Air Force. We appreciate the 
assistance of the State of Utah and the Fillmore offices of 

he BLM in crafting the Plan of Development for the site. P. A. Shea 
assisted the collaboration with valuable advice and supported the 
collaboration’s efforts. The people and the officials of Millard 
County, Utah, have been a source of steadfast and warm support 
or our work, which we greatly appreciate. We are indebted to 

he Millard County Road Department for their efforts to maintain 
and clear the roads. We gratefully acknowledge the contribution 
rom the technical staffs of our home institutions and the 
allocation of computer time from the Center for High Performance 
Computing at the University of Utah. We thank R. Cady for his 
long-standing contribution to the construction and operation of the 
detector and R. Mayta for her development of the event viewer 
ool. T. Fujii acknowledges insightful and productive discussions in 
he cosmic-ray group of Kyoto University and interdisciplinary 
communications in the Hakubi Center for Advanced Research, 
Kyoto University, and in the program for the Development of Next- 
generation Leading Scientists with Global Insight (L-INSIGHT). 
Funding: The Telescope Array experiment is supported by the 
Japan Society for the Promotion of Science (JSPS) through 
Grants-in-Aid for Priority Area 431, for Specially Promoted 
Research JP21000002, for Scientific Research (S) JP19104006, 
for Specially Promoted Research JP15H05693, for Scientific 
Research (S) JP15H05741, for Science Research (A) JP18H03705, 


Telescope Array Collaboration, Science 382, 903-907 (2023) 


for Young Scientists (A) JPH26707011, and for Fostering Joint 
International Research (B) JP19KK0074; by the joint research 
program of the Institute for Cosmic Ray Research (ICRR), the 
University of Tokyo; by the Pioneering Program of RIKEN for the 
Evolution of Matter in the Universe (r-EMU); by the US National 
Science Foundation awards PHY-1607727, PHY-1712517, PHY-1806797, 
PHY-2012934, and PHY-2112904; by the National Research Foundation 
of Korea (2017K1A4A3015188, 2020R1A2C1008230, and 
2020R1A2C2102800); by the Ministry of Science and Higher 
Education of the Russian Federation under the contract 075-15- 
2020-778; by IISN project no. 4.4501.18; by Belgian Science Policy 
under IUAP VII/37 (ULB); and by the Simons Foundation (00001470, 
NG). The Telescope Array was partially supported by the grants 
of the joint research program of the Institute for Space-Earth 
Environmental Research, Nagoya University; the Inter-University 
Research Program of the Institute for Cosmic Ray Research of the 
University of Tokyo; and by the foundations of Ezekiel R. Dumke and 
Edna Wattis Dumke, Willard L. Eccles, George S. Eccles, and Dolores 
Doré Eccles. The State of Utah supported the Telescope Array 
through its Economic Development Board, and the University of Utah 
through the Office of the Vice President for Research. Author 
contributions: T. Fujii identified the event, performed data analyses, 
and wrote the manuscript. R. Higuchi assisted the backtracking 
calculation, T. Sako performed the Monte Carlo simulations, 

M. Y. Kuznetsov calculated the relative expected flux, and I. Kharuk 
performed the proton-gamma ray classification. J. N. Matthews, 




P. Sokolsky, G. B. Thomson, H. Sagawa, C. H. Jui, S. Ogio, Y. Tsunesada, 
S. V. Troitsky, G. I. Rubtsov, P. G. Tinyakov, J. H. Kim, K. Fujita, and 

N. Globus discussed the results and commented on the manuscript. 
Other authors contributed to the detector construction, deployment, 
long-term data-taking and maintenance, software development, 

or review of the manuscript. All authors meet the journal's authorship 
criteria. Competing interests: There are no competing interests 

to declare. Data and materials availability: Raw data for this event 
and the analysis code that we used are archived at Zenodo (43). 
License information: Copyright © 2023 the authors, some 

rights reserved; exclusive licensee American Association for the 
Advancement of Science. No claim to original US government works. 
https://www.science.org/about/science-licenses-journal-article-reuse 


SUPPLEMENTARY MATERIALS 


science.org/doi/10.1126/science.abo5095 
Telescope Array Collaboration Authors 
Materials and Methods 

Supplementary Text 

Figs. S1 to S3 

References (44-50) 


Submitted 8 February 2022; resubmitted 6 December 2022 
Accepted 19 October 2023 
10.1126/science.abo5095 




RESEARCH 


HEAVY FERMIONS 


Shot noise in a strange metal 
Strange-metal behavior has been observed in materials ranging from high-temperature superconductors 
to heavy fermion metals. In conventional metals, current is carried by quasiparticles; although it has 
been suggested that quasiparticles are absent in strange metals, direct experimental evidence is lacking. 
We measured shot noise to probe the granularity of the current-carrying excitations in nanowires 

of the heavy fermion strange metal YbRh₂Si₂. When compared with conventional metals, shot noise in 
these nanowires is strongly suppressed. This suppression cannot be attributed to either electron-phonon 
or electron-electron interactions in a Fermi liquid, which suggests that the current is not carried by 
well-defined quasiparticles in the strange-metal regime that we probed. Our work sets the stage 

for similar studies of other strange metals. 


Strange metals are non-Fermi liquids that exhibit a low-temperature (T) electrical
resistivity contribution that is directly proportional to T (1). This response has
been reported across many materials families, including cuprate (2-4) and pnictide (5)
superconductors, ruthenates (6), heavy fermion metals (7-9), and twisted bilayer
graphene (10). Strange-metal properties typically arise at finite temperature above a
quantum critical point (QCP), often in proximity to antiferromagnetic order (11). There
are two broad classes of theories on metallic QCPs. Within the standard Landau approach
of order parameter fluctuations, quasiparticles retain their integrity (12, 13). By
contrast, in approaches beyond the Landau framework (14-18), no long-lived
quasiparticles are expected to remain. Thus, determining the nature of the low-energy
current-carrying excitations is an important means to elucidate the nature of strange
metals near QCPs.

How can we determine whether the current carriers in strange metals are quasiparticles?
Shot noise in electrical conduction (19) is a distinctive probe of mesoscopic systems in
which the current noise, S_I = ⟨(I − ⟨I⟩)²⟩, in a system driven out of equilibrium
accesses the nature of the charge-carrying excitations. Here, I is the instantaneous
current and ⟨I⟩ is the average current. The Fano factor, F, gives the ratio between the
measured noise S_I and 2e⟨I⟩, the expectation for Poissonian transport of ordinary
“granular” charge carriers of magnitude e with an average current ⟨I⟩. Shot noise has
revealed fractionalization of charge in the fractional quantum Hall liquid (20, 21),
fractional effective charges in quantum dot Kondo systems (22, 23), and pairing in
superconducting nanostructures in the normal state (24, 25). A lack of granular
quasiparticles would naively be expected to suppress shot noise, because the flow of a
continuous fluid should have no fluctuations.
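
The meaning of the Fano factor for granular carriers can be illustrated with a toy counting experiment. The sketch below is not a model of the measurement reported here: it assumes independent carriers that each cross a barrier with probability `transmission` in every attempt, counts the transferred charges N in successive time windows, and uses the counting-statistics identity F = Var(N)/⟨N⟩, which equals S_I/(2e⟨I⟩) for white noise under the one-sided spectral convention. All names and parameter values are hypothetical.

```python
import random


def simulate_counts(n_windows, attempts_per_window, transmission, seed=0):
    """Number of carriers transferred in each time window (binomial statistics)."""
    rng = random.Random(seed)
    return [
        sum(1 for _ in range(attempts_per_window) if rng.random() < transmission)
        for _ in range(n_windows)
    ]


def fano_factor(counts):
    """Ratio of the sample variance to the mean of the transferred-charge counts."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


if __name__ == "__main__":
    rare = simulate_counts(4000, 200, 0.02)       # rare, independent transfers
    half = simulate_counts(4000, 50, 0.5, seed=1)  # half-open channel
    print(f"rare transfers: F = {fano_factor(rare):.2f}")
    print(f"half-open channel: F = {fano_factor(half):.2f}")
```

Rare, uncorrelated transfers approach the Poissonian limit F ≈ 1, whereas correlations among the carriers (here, the binomial partitioning with F = 1 − transmission) reduce the noise below that value.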

Despite their ubiquity, strange metals have 
yet to be examined through shot-noise mea- 
surements for several technical reasons, and 
only a few relevant theoretical predictions ex- 
ist for any quantum critical systems (26, 27). In 
many materials, strange metallicity is cut off at 
low temperatures by the onset of superconduc- 
tivity, which complicates matters because shot- 
noise measurements also require an electrical 
bias eV, where e is the charge of the electron and
V is the applied voltage, that is large compared
with the thermal scale k_BT to distinguish it from
thermal noise, where k_B is the Boltzmann constant.
Tunneling transport into a strange metal faces 
the challenge that only discrete, individual elec- 
trons can be added or removed, which likely leads 
to noise dominated by single-electron effects. 
Fortunately, shot noise can be measured within a
material using a diffusive mesoscopic wire, which 
requires the nanofabrication of such structures 
without affecting electronic properties—a ma- 
jor challenge for many materials. 
Fig. 1. YbRh2Si2 nanowire device preparation and characterization. (A) YbRh2Si2 nanowire between two large-area,
thick-sputtered Au contacts on top of the unpatterned YbRh2Si2 film, which were deposited to ensure that
the measured voltage is dominated by the nanowire. (B) Higher-magnification view of the indicated region in (A).
Sample fabrication is discussed in detail in (33). (C) Normalized resistance as a function of temperature for both the
unpatterned film and the etched nanowire. The inset shows that resistivity in the low-temperature limit in both the film
and the wire is linear in temperature (with black dashed line as a linear-in-temperature guide to the eye), as seen
previously (31). Unpatterned film resistance at 100 K is 17.8 ohms. Nanowire resistance at 100 K is 164.7 ohms.
(D) Normalized resistance as a function of the in-plane magnetic field for both the unpatterned molecular-beam epitaxy
film and the etched nanowire (magnetic field B is oriented transverse to the nanowire), with curves shifted vertically for
clarity. Zero-field resistances for the film at 10, 7, 5, and 3 K (top to bottom) are 6.5, 5.5, 4.8, and 4.1 ohms,
respectively. Zero-field resistances for the wire at 10, 7, 5, and 3 K are 57.8, 49.0, 42.2, and 35.5 ohms, respectively. The
nearly identical response between the nanowire and the unpatterned film confirms that patterning did not substantially
alter the electronic properties of the epitaxial YbRh2Si2 material and that the resistance is dominated by the wire.
Fig. 2. Noise characterization of a YbRh2Si2 nanowire. (A) Differential resistance dV/dI as a function of bias
current at 10, 7, 5, and 3 K (top to bottom). Comparison with theoretical shot-noise expectations requires this
information (see Eqs. 1 and 2). (B) Averaged voltage noise spectra (with zero-bias spectra subtracted) of a YbRh2Si2
nanowire device at different bias levels at T = 3 K, over a
these voltage noise spectra are analyzed (Eq. 2) to determine the shot noise at each bias. Each spectrum shown is an average of 4500 spectra with 10-kHz bandwidth.


We have successfully made mesoscopic wires
for noise measurements from epitaxial films of
the heavy fermion material YbRh2Si2, a particularly
well-defined strange metal (9, 28).
YbRh2Si2 has a zero-temperature magnetic field-induced
continuous quantum phase transition
from a low-field antiferromagnetic heavy Fermi
liquid metal to a paramagnetic one. The Hall
effect displays a rapid isothermal crossover
that extrapolates to a jump at the QCP in the
zero-temperature limit, which provides evidence
for a sudden reconstruction of the Fermi
surface across the QCP and an associated change
in the nature of the quasiparticles between
the two phases (29), as expected in the Kondo
destruction description (14-16) for a beyond-Landau
QCP. At finite temperatures, a quantum
critical fan of strange metallicity extends
over a broad range of temperature and magnetic
field (28, 30). Recent time-domain THz
transmission measurements (31) of the optical
conductivity of epitaxial films of YbRh2Si2 reveal
the presence of quantum critical charge
fluctuations below 15 K, supporting the Kondo
destruction picture in this system.

Measuring shot noise in YbRh2Si2 wires directly
examines how current flows in a system
thought to lack discrete charge excitations;
these results can then be compared with predictions
in Fermi liquids. We report measurements
of shot noise in mesoscopic wires patterned
from epitaxial films of YbRh2Si2, examined
below 10 K, in the strange-metal regime where
phonon scattering is not expected to be relevant
to the conductivity. The measured shot noise
is found to be far smaller than both weak and
strong electron-electron scattering expectations
for Fermi liquids and also smaller than the values
measured on a gold nanowire for comparison.
Furthermore, the electron-phonon coupling
that was determined experimentally using long
YbRh2Si2 nanowires rules out strong electron-phonon
scattering as a noise-suppression mechanism.
Therefore, the suppressed shot noise is
evidence that current-carrying excitations in this
strange metal defy a quasiparticle description
in the examined temperature range. The no-quasiparticle
model described in previous work
(26, 27), despite being derived using conformal
field theory for different kinds of QCP and associated
phases than those of YbRh2Si2, predicts
nontrivial bias- and temperature-dependent
noise that is qualitatively consistent with the
observed trends.


Measuring shot noise in YbRh2Si2 devices


High-quality epitaxial films of YbRh2Si2 were
grown by molecular beam epitaxy on germanium
substrates (31, 32) [see section 3 in (33)
for details]. The temperature dependence of
the resistivity of these films above 3 K shows
strange-metal properties (ρ = ρ_0 + AT^α, where
α = 1 in the low-temperature limit, ρ is the
electrical resistivity, and A is the temperature
coefficient) as in the bulk material (Fig. 1C).
The films are patterned into nanowires through 
a combination of electron-beam lithography 
and reactive ion etching (see fig. S2 and the ac- 
companying discussion). The nanowire shown 
in Fig. 1B is 60 nm thick, 660 nm long, and 
240 nm in width. Thick source and drain con- 
tact pads ensure that the dominant voltage 
measured under bias is across the nanowire; 
the contact pads also act as thermal sinks (34). 
An important concern in fabricating nano- 
structures from strongly correlated materials is 
that the patterning process does not alter the 
underlying physics. As shown in Fig. 1C, the 
resistance R(T) of the nanowire closely matches 
that of the unpatterned film, including a domi- 
nant linear-in-T dependence at low tempera- 
tures. Similarly, in Fig. 1D, the magnetoresistance 
(field in-plane, perpendicular to the current) 
in the nanowire is nearly identical to that of 
the unpatterned film, showing that the fabrica- 
tion process did not alter the material’s proper- 
ties. This consistency also shows that the total 
R is dominated by the wire, because the large 
contacts are coated in thick gold and would 
not exhibit such a magnetoresistance. Three 
nanowires patterned from this same film all
show essentially identical transport and noise 
properties (data from devices 2 and 3 are 
shown in fig. S5). 
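
The claim that the wire tracks the unpatterned film can be checked roughly from the zero-field resistances quoted in the Fig. 1 caption. The sketch below is only an illustration: it takes those eight caption values, forms the wire-to-film ratio at each temperature, and fits a straight line to the wire resistance. The helper names are hypothetical, and a four-point fit is a sanity check, not a substitute for the analysis in the article.

```python
# Zero-field resistances quoted in the Fig. 1 caption (ohms), at 10, 7, 5, and 3 K.
temps_k = [10.0, 7.0, 5.0, 3.0]
film_ohm = [6.5, 5.5, 4.8, 4.1]
wire_ohm = [57.8, 49.0, 42.2, 35.5]


def wire_to_film_ratios(wire, film):
    """Ratio of wire to film resistance at each temperature."""
    return [w / f for w, f in zip(wire, film)]


def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


ratios = wire_to_film_ratios(wire_ohm, film_ohm)
intercept_ohm, slope_ohm_per_k = linear_fit(temps_k, wire_ohm)

for t, r in zip(temps_k, ratios):
    print(f"T = {t:4.1f} K  wire/film = {r:.2f}")
print(f"wire R(T) ~ {intercept_ohm:.1f} ohm + {slope_ohm_per_k:.2f} ohm/K * T")
```

A nearly temperature-independent ratio is what one would expect if patterning changed only the geometry and not the material.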

The noise-measurement technique is well de- 
veloped (21, 24, 34). A current bias is applied to 
the device by means of a heavily filtered voltage 
source and ballast resistors. Using a custom 
probe, the voltage across the device is measured 
through two parallel sets of amplifiers and a 
high-speed data acquisition system (fig. S1C). 
The time-series data are cross-correlated and
Fourier transformed to yield the voltage noise S_V
across the device, with the correlation mitigating
the amplifier input noise [see sections 1 and 2 of
(33) for a detailed discussion of calibration and
averaging]. Figure 2A shows the variation of the
differential resistance, dV/dI, as a function of
bias current, whereas Fig. 2B gives examples of
voltage noise spectra at several bias currents
at a base temperature of 3 K. At the maximum
bias currents that were applied, the
voltage drop across the wire is several mV, a
bias energy scale that considerably exceeds
k_BT (0.25 meV at 3 K), as is needed for shot
noise measurements.
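
The comparison of the bias energy with the thermal scale is simple arithmetic, sketched below. The 3 mV bias is a hypothetical stand-in for "several mV", and the function names are illustrative only.

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def thermal_energy_mev(temperature_k):
    """Thermal energy k_B * T in meV."""
    return K_B_EV_PER_K * temperature_k * 1e3


def bias_to_thermal_ratio(voltage_v, temperature_k):
    """Ratio eV / (k_B T); eV expressed in eV is numerically the voltage in volts."""
    return voltage_v / (K_B_EV_PER_K * temperature_k)


ASSUMED_BIAS_V = 3e-3  # hypothetical value standing in for "several mV"

for t in (3.0, 5.0, 7.0, 10.0):
    print(f"T = {t:4.1f} K  k_BT = {thermal_energy_mev(t):.3f} meV  "
          f"eV/k_BT = {bias_to_thermal_ratio(ASSUMED_BIAS_V, t):.1f}")
```

The ratio shrinks as temperature rises, which is why the finite-temperature form of the shot-noise expression matters most for the highest-temperature data.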


Theoretical expectations for the shot noise 
and Fano factor 


To understand the measured noise in YbRh2Si2,
we first considered the expected current shot
noise result for a diffusive metallic constriction.
This is a long-established calculation within the
Landauer-Büttiker formalism (35-39) for a Fermi
gas (i.e., without any electron-electron interactions).
A metal with well-defined quasiparticles is
assumed in the source and drain, which obey the
Fermi-Dirac (FD) distribution with a temperature
set by the contacts, T_0. In the noninteracting,
nanoscale limit, conduction takes place through
spin-degenerate quantum channels with various
transmittances, τ_i. Each channel contributes to S_I
by an amount proportional to τ_i(1 − τ_i). By averaging
over the distribution of transmittances
(35-38), one finds a predicted Fano factor
Fig. 3. Noise versus bias current characteristics. (A) Noise versus bias
current for a YbRh2Si2 wire at 10, 7, 5, and 3 K (top to bottom), with fits to
Eq. 1 to extract effective Fano factors. Error bars are the standard deviation from
15 repeated bias-sweep measurements. Also shown for illustrative purposes
are expectations for F = √3/4, 1/3, and 0 (indicated by the gray dot-dashed
curves from top to bottom, respectively), which were calculated by using the
measured differential resistance at each temperature and Eq. 2. At all


F = S_I/2e⟨I⟩ = 1/3. When inelastic electron-electron
scattering is added to the otherwise
noninteracting Fermi system, such that the
system size along the direction of the current
exceeds the electron-electron scattering length,
L > L_ee (see fig. S11), but is smaller than the
electron-phonon scattering length, L_ph, there is
a redistribution of energy and effective thermalization
among the carriers (40, 41). There
is a local quasi-thermal FD distribution within
the wire described by a local electronic temperature
T_e(x) that is higher than the lattice temperature,
T_0, which is assumed to be uniform
and equal to the temperature of the contacts.
This approach leads to a prediction of F =
√3/4 = 0.433 (40, 41). Fano factor predictions
in both L < L_ee < L_ph and L_ph > L > L_ee
limits have been confirmed in experiments in
mesoscopic metal wires (34, 42-44).
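
The two Fermi liquid benchmarks can be reproduced numerically. The sketch below assumes the standard bimodal transmission distribution for a diffusive conductor, P(τ) ∝ 1/(τ√(1 − τ)); the text only refers to "the distribution of transmittances", so this specific form is an assumption brought in for illustration. With the substitution τ = 1/cosh²(x), x is uniformly distributed, and the Fano factor is the ratio of the averages of τ(1 − τ) and τ.

```python
import math


def diffusive_fano(n_steps=200_000, x_max=20.0):
    """Fano factor <tau (1 - tau)> / <tau> for a diffusive wire.

    Assumes tau = 1 / cosh(x)**2 with x uniform on [0, x_max], which is
    equivalent to the bimodal distribution P(tau) ~ 1 / (tau * sqrt(1 - tau)).
    The integrals are evaluated with the midpoint rule.
    """
    dx = x_max / n_steps
    noise_sum = 0.0
    conductance_sum = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * dx
        tau = 1.0 / math.cosh(x) ** 2
        noise_sum += tau * (1.0 - tau) * dx
        conductance_sum += tau * dx
    return noise_sum / conductance_sum


# Hot-electron (strong electron-electron scattering) value quoted in the text.
HOT_ELECTRON_FANO = math.sqrt(3.0) / 4.0

print(f"weak e-e scattering:   F = {diffusive_fano():.4f}")
print(f"strong e-e scattering: F = {HOT_ELECTRON_FANO:.4f}")
```

The numerical average lands on 1/3 to within the integration error, and √3/4 evaluates to the 0.433 quoted above.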

In the present context, an important ques- 
tion is what happens in a Fermi liquid state 
when the electron-electron interactions are so 
strong that the quasiparticle weight is orders 
of magnitude smaller than the noninteract- 
ing case (equal to 1) for a free electron (though 
still nonzero) and the Landau parameters are 
correspondingly large. It was recently shown 
that charge conservation constrains the Fano 
factor to be independent of the quasiparticle 
weight, and the combination of instantaneous 
electronic interactions and Poissonian charge 
transport dictates that the shot noise and av- 
erage current get renormalized identically by 
the Landau parameters (45). As a result, for 
this regime (see solid line in fig. S11) that per- 
tains to a strongly correlated Fermi liquid of 
interest here, the Fano factor would be F =
√3/4 = 0.433. [For further details, see section
13 of (33) and (45).]

Chen et al., Science 382, 907-911 (2023)    24 November 2023

Within the Fermi liquid quasiparticle pic- 
ture, the only way to suppress shot noise below 
these levels is through strong electron-phonon 
scattering, which perturbs the electronic dis- 
tribution function. In the limit of very strong 
electron-phonon coupling, the electronic dis- 
tribution is constrained to be in equilibrium 
with the lattice temperature, T_0, and only
Johnson-Nyquist noise at T_0 remains.


Comparison to theoretical expectations 


Figure 3 shows the measured voltage noise as
a function of bias current for a YbRh2Si2 nanowire,
and its counterpart for a gold nanowire
for comparison. Shown as gray dot-dashed lines
are the F = 1/3 expectations based on the measured
differential resistance, dV/dI. Independent
of any detailed analysis, the measured noise
in the YbRh2Si2 wire is clearly suppressed
well below the Fermi liquid expectation at all 
temperatures. Additional data on two more 
wires (devices 2 and 3) are essentially iden- 
tical [see section 7 of (33) and fig. S5]. By 
contrast, the gold nanowire data [discussed 
further in section 9 of (33) and fig. S7] are 
consistent with Fermi liquid predictions, with 
a slight suppression of the noise above 10 K as 
electron-phonon scattering becomes relevant 
(fig. S7D). 

The electron-phonon coupling may be ex- 
tracted experimentally by analyzing the noise 
as a function of bias in a wire sufficiently long 
that electron-phonon scattering is dominant 
(42). As detailed in section 5 of (33), we performed
this analysis using a 30-μm-long YbRh2Si2
wire to determine the effective electron-phonon
coupling in this material and found a value
sufficiently small that strong electron-phonon
scattering is ruled out as a mechanism for suppressing
the noise in the much shorter YbRh2Si2
nanowire constrictions.

temperatures, the measured voltage noise is far below the theoretical expectations
for shot noise in a diffusive nanowire of a Fermi liquid, even in the weak
electron-electron scattering limit. (B) Analogous data for a gold wire over the
same temperature range, as discussed in section 9 of (33). Data here are much
closer to the F = 1/3 Fermi liquid expectation, with the small deviation at the
highest temperatures being attributed to electron-phonon scattering effects.
Error bars are the standard deviation from 15 repeated bias-sweep measurements.

Extracting effective Fano factors from the 
measured noise requires analysis in terms of 
finite temperature expressions for the shot noise. 
Subtleties about thermal noise can arise when 
the device is non-ohmic, as discussed in sec- 
tion 10 of (33), but corrections from the ohmic 
case are small for the measured nonlinearities 
shown in Fig. 2A. The expected form for the 
current shot noise in an ohmic system with Fano 
factor F and differential resistance dV/dI is (19)

$$
S_I = F \cdot 2e\langle I \rangle \coth\!\left(\frac{eV}{2k_{\mathrm{B}}T}\right) + (1 - F)\,4k_{\mathrm{B}}T\left(\frac{dV}{dI}\right)^{-1} \qquad (1)
$$


This expression reduces to the Johnson-Nyquist
current noise S_{I,JN} = 4k_BT (dV/dI)^{-1} in the zero-bias
limit and becomes S_I ≈ F · 2e⟨I⟩ as expected
in the high-bias limit eV ≫ k_BT. In the
experiment, we measured voltage noise, and, for 
ease of comparison, we subtracted off the zero- 
bias Johnson-Nyquist noise so that effective 
Fano factors may be estimated by fitting to the 


voltage-based expression for the shot noise: 


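$$
S_V = \left(\frac{dV}{dI}\right)^{2}\left[F \cdot 2e\langle I \rangle \coth\!\left(\frac{eV}{2k_{\mathrm{B}}T}\right) + (1 - F)\,4k_{\mathrm{B}}T\left(\frac{dV}{dI}\right)^{-1}\right] - 4k_{\mathrm{B}}T\left.\frac{dV}{dI}\right|_{I=0} \qquad (2)
$$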
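
As a numerical illustration of Eq. 1 and of the zero-bias subtraction described in the text, the sketch below evaluates the current noise and the corresponding excess voltage noise. It is a minimal sketch under stated assumptions: the function names are hypothetical, the Johnson-Nyquist term is subtracted using the zero-bias differential resistance, and the 40-ohm wire in the example is an arbitrary ohmic test case, not a measured device.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
K_B = 1.380649e-23          # Boltzmann constant, J/K


def current_noise(fano, current_a, voltage_v, temp_k, dv_di_ohm):
    """Current noise S_I (A^2/Hz) following the form of Eq. 1."""
    thermal = 4.0 * K_B * temp_k / dv_di_ohm
    if voltage_v == 0.0:
        # The coth term tends to the Johnson-Nyquist value at zero bias.
        return thermal
    x = E_CHARGE * voltage_v / (2.0 * K_B * temp_k)
    shot = fano * 2.0 * E_CHARGE * current_a / math.tanh(x)
    return shot + (1.0 - fano) * thermal


def excess_voltage_noise(fano, current_a, voltage_v, temp_k,
                         dv_di_ohm, dv_di_zero_bias_ohm):
    """Voltage noise (V^2/Hz) with zero-bias Johnson-Nyquist noise removed.

    Assumption: the subtracted term uses the zero-bias differential resistance.
    """
    s_i = current_noise(fano, current_a, voltage_v, temp_k, dv_di_ohm)
    return dv_di_ohm ** 2 * s_i - 4.0 * K_B * temp_k * dv_di_zero_bias_ohm


# Hypothetical ohmic wire: 40 ohm at 3 K, compared for several Fano factors.
R_OHM, T_K, I_A = 40.0, 3.0, 1e-4
for f in (0.0, 1.0 / 3.0, math.sqrt(3.0) / 4.0):
    s_v = excess_voltage_noise(f, I_A, I_A * R_OHM, T_K, R_OHM, R_OHM)
    print(f"F = {f:.3f}  excess S_V = {s_v:.3e} V^2/Hz")
```

For an ohmic device the excess noise is proportional to F, so F = 0 gives zero excess noise at every bias, which is the bottom reference curve in Fig. 3.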
[Fig. 4 panel labels: e-e scattering strong; e-e scattering weak; I(t)]
Fig. 4. Fano factors and context for their interpretation. Fano factors found
from fitting the data in Fig. 3 are shown. Error bars are the standard error
from fitting 15 repeated bias sweep measurements. In a Fermi liquid, current is
carried by individual quasiparticle excitations, and the current as a function of
time fluctuates with the arrival of each discrete transmitted carrier. Carriers
scatter diffusively through static disorder (brown dots). When electron-electron
scattering is weak (sample length L < L_ee), the expected Fano factor is F = 1/3


The fitted Fano factors of a YbRh2Si2 device
and a gold nanowire device are shown in
Fig. 4, which provides a direct comparison
between YbRh2Si2 and a Fermi liquid diffusive
wire. Corrections stemming from the non-ohmic
response lead to lower inferred Fano
factors (fig. S8).
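
For an ohmic wire, the excess voltage noise is linear in F, so an effective Fano factor follows from a one-parameter least-squares fit. The sketch below illustrates that idea on synthetic data; it is not the authors' fitting procedure. The resistance, temperature, bias currents, and the "true" Fano factor are all hypothetical, and `shot_basis` and `fit_fano` are illustrative names.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
K_B = 1.380649e-23          # Boltzmann constant, J/K


def shot_basis(current_a, temp_k, resistance_ohm):
    """Excess voltage noise per unit Fano factor for an ohmic wire (V^2/Hz).

    With V = I * R, subtracting the zero-bias Johnson-Nyquist noise leaves
    S_V = F * R^2 * [2 e I coth(eV / 2 k_B T) - 4 k_B T / R].
    """
    x = E_CHARGE * current_a * resistance_ohm / (2.0 * K_B * temp_k)
    johnson_current = 4.0 * K_B * temp_k / resistance_ohm
    return resistance_ohm ** 2 * (
        2.0 * E_CHARGE * current_a / math.tanh(x) - johnson_current
    )


def fit_fano(currents_a, excess_noise, temp_k, resistance_ohm):
    """Least-squares Fano factor from excess voltage noise versus bias current."""
    numerator = 0.0
    denominator = 0.0
    for current, s_v in zip(currents_a, excess_noise):
        if current == 0.0:
            continue  # zero bias carries no information about F
        basis = shot_basis(current, temp_k, resistance_ohm)
        numerator += basis * s_v
        denominator += basis * basis
    return numerator / denominator


# Synthetic check with hypothetical parameters: recover a known Fano factor.
currents = [k * 1e-5 for k in range(1, 11)]
true_fano = 0.1
synthetic = [true_fano * shot_basis(i, 3.0, 40.0) for i in currents]
fitted = fit_fano(currents, synthetic, 3.0, 40.0)
print(f"recovered F = {fitted:.3f}")
```

A non-ohmic device would need the measured dV/dI at each bias in place of a single resistance, which is the correction the text says lowers the inferred Fano factors.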

Our detailed thermal modeling of the present
system under the standard Fermi liquid assumptions
[see section 6 of (33)] confirms that, including
electronic thermal transport through the
Wiedemann-Franz relation and the measured
electron-phonon coupling, F = √3/4 would
be expected in the present high-bias limit. This
is in sharp contrast to the experimental data
shown in Fig. 3A. Experiments on bulk YbRh2Si2
crystals do not show large deviations from the
Wiedemann-Franz relation in this temperature
range (46, 47). Electronic transport measurements
and thermodynamic measurements of
YbRh2Si2 in this temperature regime, as well
as THz optical conductivity measurements (31)
in these films, show that phonons are not contributing
strongly to the electronic properties
in YbRh2Si2 below 15 K. As shown in section 6 of
(33) and fig. S4, the measured electron-phonon
coupling in YbRh2Si2 is too small by more than
a factor of 35 to be responsible for the observed
noise suppression.

To interpret these results, it is important to 
consider the nature of quasiparticles in terms 
of the single-particle spectral function and 
distribution functions. For a Fermi gas, the 
single-particle spectral function A(k, ε) at a
given wavevector k is a delta function in energy
ε at ε = E_k, where E_k is the quasiparticle
energy as a function of k, meaning that a particle
excitation at (k, E_k) in the zero temperature
limit is perfectly well defined in energy and
has an infinite lifetime with a spectral weight
Z = 1. Correspondingly, the particle excitations
follow the FD distribution, and the Fermi surface
is a perfectly sharp boundary at T = 0. In
a Fermi liquid, the spectral function retains a
peak for k near the Fermi surface, which describes
a quasiparticle with a nonzero spectral
weight Z < 1. The distribution function near
the Fermi surface is smeared but still has a
nonzero discontinuity at T = 0 K (48).

vanish (red mark). Dashed

In the case of the particular type of non- 
Fermi liquid with a complete destruction of 
quasiparticles, one has Z = 0 everywhere on the 
Fermi surface. With such a complete smearing 
of the Fermi surface, there is no discontinuity 
in the distribution function even at T = 0 K. 
In this limit, when driven by a bias that does 
not greatly perturb the non-FD distribution 
function, there are no granular quasiparticles 
that carry the electrical current. We can then 
expect a much-reduced shot noise, as we observe 
in the form of a Fano factor that is considerably
smaller than not only the strong electron-electron
scattering expectation F = √3/4 but
even the weak electron-electron scattering 
counterpart F = 1/3. We highlight this contrast 
in Fig. 4 and discuss it further in section 13 of 
(33). For reference, in the extreme case when 
the electron spectral function at any given k as 
a function of energy is entirely featureless, the 


continuous electron fluid would have no shot
noise at all (F = 0). Interestingly, one approach to
a quantum critical system with no quasiparticles
(26, 27) predicts nonzero noise with a trend in
bias and temperature that is quantitatively similar
to that shown in Fig. 3A (as seen in fig. S10),
though that model is based on a different form of
quantum criticality (a superconductor-insulator
transition) than that in YbRh2Si2. This is discussed
further in section 12 of (33).

[Fig. 4 panel labels: current-carrying state without dispersive excitations; I(t)]

(green mark in the graph), whereas electron-phonon coupling can suppress
this at higher temperatures. When electron-electron scattering is strong (L > L_ee),
the expected Fano factor is F = √3/4 (blue mark in the graph). In a system
without well-defined quasiparticles, charge transport is more continuous,
which leads to suppressed current fluctuations; in the extreme limit, where
electronic excitations are entirely nondispersive, the Fano factor is expected to

lines are guides to the eye.


Discussion and outlook 


Shot noise is a probe that gives special access to 
the nature of charge carriers. The suppressed 
noise shown in Fig. 3A and summarized in Fig. 4 
is evidence that current in this strange-metal 
regime is not governed by the transport of 
individual, granular quasiparticles. A Fano fac- 
tor of zero is expected only for the most ex- 
treme case of a non-Fermi liquid that has a 
completely flat spectral function. A non-Fermi 
liquid that still has residual dispersive spec- 
tral features, despite a vanishing quasiparticle 
weight Z, would lead to a nonzero Fano factor. 
Any residual dispersive spectral features are 
naturally expected to somewhat sharpen as
T → 0 K, leading to a rise in F as temperature is
lowered, but, in that case, would never reach the
F = √3/4 expectation for a strongly correlated
Fermi liquid (where Z is finite). The present experiment
takes place firmly in the non-Fermi
liquid regime (down to nearly a factor of 10 below
the effective single-ion Kondo temperature)
that is already seen to exhibit critical scaling of
the optical conductivity (31). As discussed further
in section 13 of (33), an effective Fermi liquid
SULFUR CYCLE


Deconvolving microbial and environmental controls 
on marine sedimentary pyrite sulfur isotope ratios 


R. N. Bryant*, J. L. Houghton, C. Jones, V. Pasquier, I. Halevy, D. A. Fike


Reconstructions of past environmental conditions and biological activity are often based on bulk stable
isotope proxies, which are inherently open to multiple interpretations. This is particularly true of
the sulfur isotopic composition of sedimentary pyrite (δ³⁴S_pyr), which is used to reconstruct ocean-atmosphere
oxidation state and track the evolution of several microbial metabolic pathways. We present
a microanalytical approach to deconvolving the multiple signals that influence δ³⁴S_pyr, yielding both
the unambiguous determination of microbial isotopic fractionation (ε_mic) and new information about
depositional conditions. We applied this approach to recent glacial-interglacial sediments, which feature over
70‰ variations in bulk δ³⁴S_pyr across these environmental transitions. Despite profound environmental
change, ε_mic remained essentially invariant throughout this interval and the observed range in δ³⁴S_pyr was
instead driven by climate-induced variations in sedimentation.


The sedimentary pyrite sulfur isotope
record is a robust archive of past biogeochemical
cycling, yet considerable uncertainty
remains about how to interpret
it because it is a complex mixture of biological
and environmental forcings and can be
influenced at multiple stages of mineral precipitation
during and after deposition. Despite
a much earlier body of work suggesting that
the sulfur isotopic composition of sedimentary
pyrite (δ³⁴S_pyr) is affected by local environmental
conditions (1-5), stratigraphic variations in
δ³⁴S_pyr have been interpreted to reflect changes
in the global biogeochemical sulfur cycle, such
as changes in the sulfur isotope signature of
marine sulfate (δ³⁴S_sulfate) (6-8), or the operation
of specific microbial pathways (e.g., sulfate
reduction, disproportionation, or sulfide
oxidation) (9-11). Recently, after a gap of several
decades, workers have begun to focus once
more on how local environmental conditions
modulate the δ³⁴S signature that is ultimately
preserved in sedimentary pyrites (12-17).


δ³⁴S_pyr values are locally controlled


The importance of depositional conditions on
stratigraphic variation in δ³⁴S_pyr was highlighted
by the identification of large-amplitude
(>70‰) fluctuations in δ³⁴S_pyr correlating with
~100,000-year Pleistocene glacial-interglacial
cycles in the PRGL1-4 core in the Gulf of Lion
in France (15) (Fig. 1A). Because of the long
(13 million years) residence time of sulfate in
the ocean (18), such rapid, large fluctuations in
δ³⁴S_pyr could not reflect changes in seawater
δ³⁴S_sulfate. Instead, relative enrichments in ³⁴S
of pyrite from glacial sediments with respect
to interglacials were hypothesized to have resulted
from changes in either biological or environmental
forcings. Potential biological drivers
for more positive δ³⁴S_pyr include increases in
the cell-specific sulfate reduction rate or decreases
in the presence of microbial disproportionation,
resulting in a smaller net isotopic
fractionation (ε_mic) between pore water sulfate
and microbially produced sulfide (17, 19-22).
Alternatively, an increased sedimentation rate
during glacial intervals could have increased
the importance of sulfate supply from the overlying
water column by burial of sulfate in pore
water relative to the replenishment of sulfate
by diffusion, thereby decreasing the “openness”
of the pore water system (12, 23, 24). In other
words, changes in the δ³⁴S_pyr signal could arise
from two broad classes of explanations, relying
either on inherent changes to microbial community
structure or metabolic activity or
externally forced changes in ambient environmental
parameters. The former could be driven
by changing organic carbon and nutrient
availability between glacial and interglacial
environments and the latter by changes in
sedimentation rate and porosity and permeability
associated with varying riverine flux of
sediments driven by glacioeustatic sea-level
changes. In general, both types of information
(biological and environmental) are of extreme
interest to those trying to reconstruct past
changes in Earth’s surface environment. However,
to complicate matters, increased sedimentation
can drive changes in the quantity
and lability of organic carbon reaching the
zone of sulfate reduction (25, 26) and in the
degree of sulfide oxidation, where the resulting
increases in net sulfate reduction would
steepen isotope gradients within sediments.
Perhaps unsurprisingly, then, it has not yet
been possible to distinguish between these fundamentally
different (microbial versus environmental)
classes of explanations in driving the
observed variation in the bulk sedimentary
δ³⁴S_pyr record from the Gulf of Lion (15) or indeed
other similar records from throughout
Earth’s history (10, 11, 27).

Fig. 1. Study site map and example of mounted sedimentary pyrite. (A) Location of core PRGL1-4
(red star) in the Gulf of Lion in the northwest Mediterranean. The thick black line marks the
approximate location of the coastline during the last glacial maximum. Inset shows the position of
the Gulf of Lion in southern Europe. [Service layer credits: Copyright EMODnet Bathymetry 2015]
(B) Photomicrograph of Gulf of Lion sediment pyrite extracts mounted in a 2.5-cm-diameter
epoxy puck and polished. Inset shows a close-up of an epoxy-mounted pyrite grain after rastering
during SIMS scanning ion imaging.

Here, we demonstrate an approach to deconvolving
the multiple ecological and environmental
components that contribute to the
aggregate δ³⁴S_pyr signature. Specifically, we
used secondary ion mass spectrometry (SIMS)
scanning ion imaging (28-30) to analyze the
sulfur isotope composition of numerous individual
pyrite grains (Fig. 1B) separated from
discrete sediment samples (31). Scanning ion
imaging involves the bombardment of raster
squares with an ~10 pA, <1-μm-diameter Cs⁺
beam, with an electron multiplier used to assign
counts of ³²S⁻ and ³⁴S⁻ to pixels of an image
(29). This method was applied to pyrites
from five samples spanning the large-amplitude
stratigraphic fluctuations in δ³⁴S_pyr observed
across one cycle of glacial retreat and subsequent
advance [147 to 65.3 thousand years (ka)]
in the Gulf of Lion (15). As pyrites grew within
the sediments, they collectively continued to
sample the ambient pore water sulfide pool
and reflect its evolving isotopic composition
(14, 29, 32). A time series of pore water sulfide
δ³⁴S through the sediment burial process (Fig.
2A) is therefore expected, given sufficient iron
availability (33), to be recorded in the population
of pyrites in a given sample (Fig. 2B) (29).
In this case, the maximum offset between
δ³⁴S_pyr and seawater δ³⁴S_sulfate would represent
the apparent ε_mic, and the overall spread
of δ³⁴S_pyr to higher values would reflect the
degree of closed-system evolution of pore
water sulfide during pyrite formation. Therefore,
this approach can yield both biological
and environmental information that can aid
interpretation of bulk δ³⁴S_pyr data.
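
The two quantities described above can be illustrated with a short calculation. This is a sketch under stated assumptions, not the authors' workflow: the grain values are hypothetical, the seawater sulfate value of 20.6‰ is the one the article adopts, and closed-system evolution is represented by a simple Rayleigh distillation with a constant fractionation factor approximated as α = 1 − ε_mic/1000.

```python
SEAWATER_SULFATE_D34S = 20.6  # permil, seawater value adopted in the article


def apparent_eps_mic(grain_d34s, sulfate_d34s=SEAWATER_SULFATE_D34S):
    """Apparent fractionation: offset between seawater sulfate and the lightest grain."""
    return sulfate_d34s - min(grain_d34s)


def rayleigh_sulfide_d34s(fraction_remaining, eps_mic,
                          sulfate_d34s_initial=SEAWATER_SULFATE_D34S):
    """Instantaneous sulfide d34S when a fraction of the initial sulfate remains.

    Assumes closed-system Rayleigh distillation with a constant
    product/reactant fractionation factor alpha = 1 - eps_mic / 1000.
    """
    alpha = 1.0 - eps_mic / 1000.0
    sulfate = (1000.0 + sulfate_d34s_initial) * fraction_remaining ** (alpha - 1.0) - 1000.0
    return alpha * (1000.0 + sulfate) - 1000.0


# Hypothetical grain-specific values (permil) for a single sample.
grains = [-48.0, -35.2, -10.4, 12.7]
print(f"apparent eps_mic = {apparent_eps_mic(grains):.1f} permil")

# Sulfide becomes progressively heavier as the sulfate pool is consumed.
for f in (1.0, 0.75, 0.5, 0.25):
    print(f"f = {f:.2f}  sulfide d34S = {rayleigh_sulfide_d34s(f, 70.0):.1f} permil")
```

In this picture the lightest grains constrain ε_mic, while the tail toward heavier values tracks how far the pore water sulfate pool was drawn down while pyrite was still forming.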


Environmental conditions, over ε_mic, drive
modern bulk δ³⁴S_pyr variations


Ranges in grain-specific δ³⁴S_pyr data (Fig. 3A
and table S1) overlapped with corresponding
bulk δ³⁴S_pyr values in all cases. The lower extremes
of the five sampled intervals all fell between
−60.2 ± 1.0‰ and −48.0 ± 0.9‰, whereas
the upper quartiles and upper extremes were
more positive for samples with more positive
bulk δ³⁴S_pyr values. Glacial pyrites displayed a
weak negative correlation between apparent
grain size and δ³⁴S_pyr values (n = 141, ρ =
−0.32, P = 0.00014; fig. S1), whereas there
was no significant correlation for interglacial
pyrites (n = 68, ρ = 0.1, P = 0.39; fig. S1). For
δ³⁴S_pyr data normalized by grain area, the difference
in the mean values for glacial and
interglacial grains was statistically significant
(ANOVA test; P = 0.008). In general, the median
values from the grain-specific analyses
tracked bulk δ³⁴S_pyr values for the same samples.
There was minimal offset between these
data in interglacial samples, but bulk glacial
δ³⁴S_pyr values were more positive than the
grain-specific medians by up to ~30‰ (Fig. 3A).
This discrepancy may arise from the wide range
of δ³⁴S_pyr values in individual grains for these
samples (Fig. 3, A and B), the necessarily incomplete
sampling of individual grains, or the
challenges of this method for measuring grains
<1 μm (29), which are expected to be enriched
in ³⁴S relative to the larger grains (fig. S1). For
most pyrites, intragrain isotopic heterogeneity
was <2‰ and had no consistent directionality
(fig. S2). However, some grains’ rims were 5 to
15‰ enriched in ³⁴S relative to their cores (fig.
S3), indicative of evolving pore water sulfide
isotopic compositions during pyrite growth.

Fig. 2. Schematic explanations for bulk δ³⁴S_pyr oscillations in PRGL1-4. (A) Differential burial depths
of a parcel of sediment (empty black rectangle) and decline of organic matter content (brown-yellow
gradient) and sulfate concentration (colored lines) below the chemocline in a given amount of time between
interglacial and glacial settings. (B) Corresponding sulfate and sulfide δ³⁴S profiles, with arrows indicating
the extent of δ³⁴S values encountered by the sediment parcels in (A). Inset shows expected δ³⁴S_pyr
distributions (colored lines) and bulk δ³⁴S_pyr values (black vertical dashed lines). Glacial departures from
the expected interglacial scenario, with large ε_mic and relatively open-system pore water evolution (red), may
be due to no change in ε_mic and more closed-system pore water evolution (blue solid lines) or smaller
ε_mic and relatively open-system pore water evolution (blue dashed lines), both yielding the same integrated
bulk δ³⁴S_pyr value.
Assuming a constant water column 8°*S.ufate 
value of 20.6%o over the time interval studied 
(34), grain-specific Caner minima and modes 
suggest that the maximum apparent E;ic, a 
function of cell-specific sulfate reduction rate 
and reversibility of reactions (19-22), was near- 
ly invariant (~12%o variation, whereas bulk 
3 Sve values changed by ~47%o) and cen- 
tered around values of ~’70%o between 147 and 
65.3 ka during both glacial and interglacial in- 
tervals (Fig. 3A). This calculated €;,ic approaches 
the expected equilibrium fractionation between 
sulfate and sulfide at the temperature of the 
studied sediments (35), which is much larger 
than reconstructions of €,,;- that are based on 
prior bulk sampling work (74) but consistent 
with SIMS-based findings from other localities 
(36). Despite the profound environmental 
changes across glacial-interglacial regimes, 
nearly constant €,,i- values indicate no sub- 
stantive change in the activity of sulfate- 
reducing microbes. As such, the observed 
large temporal changes in bulk 5°4S,,,, in this 
time interval (~47%o) (15) were not driven by 
changes in microbial activity and instead re- 
flect other environmental changes. The large 
intergrain 8°**S variability observed in glacial 
pyrite populations (Fig. 3, A and B) provides 
evidence that basin-scale marine regression, 
which increased sedimentation rate at the site 
(15), decreased replenishment of sulfate by dif- 
fusion from the overlying water column rela- 
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tive to sulfate supply by burial of sulfate (i.e., 
decreased pore water openness). Such a sce- 
nario would produce strong enrichments in 
349 of sulfate and sulfide with depth in the 
sediment column (Fig. 2B) (23), which under 
continued pyrite formation, a function of reac- 
tive iron availability that decreases with time 
and burial (29, 30, 32, 33), would generate the 
tail of positive 5°“S,,, values observed in gla- 
cial samples (Fig. 3, A and B, and fig. S1). 
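The apparent ε_mic estimate described above (seawater sulfate δ³⁴S minus the lowest grain-specific δ³⁴S_pyr) can be illustrated with a minimal sketch. The grain values below are hypothetical placeholders, not measurements from this study, and the plain mean is only a rough stand-in for a bulk value, which in practice is weighted by pyrite mass.

```python
import statistics

# Seawater sulfate d34S (permil) assumed constant, as stated in the text.
SEAWATER_SULFATE_D34S = 20.6


def apparent_epsilon_mic(d34s_pyr_grains, d34s_sulfate=SEAWATER_SULFATE_D34S):
    """Maximum apparent fractionation: sulfate d34S minus the lowest grain d34S."""
    return d34s_sulfate - min(d34s_pyr_grains)


# Hypothetical grain-specific d34S_pyr values (permil), for illustration only.
interglacial_grains = [-49.6, -47.3, -46.8, -44.0, -41.5]
glacial_grains = [-48.9, -45.0, -30.2, -5.1, 12.4]  # same minimum, long positive tail

eps_interglacial = apparent_epsilon_mic(interglacial_grains)
eps_glacial = apparent_epsilon_mic(glacial_grains)

# Unweighted means as a crude proxy for the bulk values.
bulk_interglacial = statistics.mean(interglacial_grains)
bulk_glacial = statistics.mean(glacial_grains)

print(f"apparent eps_mic: interglacial {eps_interglacial:.1f}, glacial {eps_glacial:.1f}")
print(f"bulk-like mean:   interglacial {bulk_interglacial:.1f}, glacial {bulk_glacial:.1f}")
```

In this toy case the two apparent ε_mic values differ by less than 1‰ while the bulk-like means differ by more than 20‰, which is the qualitative pattern the text attributes to a change in pore water openness rather than in microbial fractionation.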

We developed a biologically informed diagenetic model to calculate the evolution of δ³⁴S_pyr values in progressively buried sediment given estimates of sedimentation rate, porosity, bottom water oxygen concentration and temperature, and organic carbon and reactive iron mass fractions (37). When run with sedimentary parameters (table S2) inferred to reflect the deposition of these samples (15), the model produced pyrite histograms that broadly agree with the corresponding δ³⁴S_pyr results presented here (Fig. 3B). Higher minimal δ³⁴S_pyr values in the model than in the measurements may reflect imperfect model representation of oxidative processes in the sedimentary sulfur cycle, which tend to increase the sulfate-sulfide δ³⁴S offset. Nevertheless, the model demonstrates that changing sedimentary parameters, most likely the independently constrained (15) increase in sedimentation rate, drove the change
observed in the glacial δ³⁴S_pyr distributions. At the studied site, more-rapid glacial sedimentation and a steeper pore water sulfate concentration gradient led to higher net rates of sulfate reduction, and this may be the case in other nearshore depositional environments. Other sedimentary parameters may cause differences in δ³⁴S_pyr distributions by affecting the range of depths over which pyrite is formed [e.g., amount and reactivity of iron-bearing phases (22, 24)] or by changing the bulk rate of sulfate reduction at the level of the microbial community [e.g., the amount and reactivity of organic matter (12)]. Some of these parameters may be independently constrained (e.g., by iron speciation or total organic carbon measurements), but there was no evidence for such changes in the sediments studied here.

Fig. 3. SIMS δ³⁴S_pyr data for samples from PRGL1-4. (A) Box-and-whisker plots of SIMS δ³⁴S_pyr values plotted against sample age. Individual data are colored by grain area and jittered for clarity. All bulk δ³⁴S_pyr data (15) and those corresponding to the samples used in this study are shown as black x symbols and gold stars, respectively. The gray region is the 95% confidence interval of a smoothed loess function through all bulk δ³⁴S_pyr data. (B) Density plots of the interglacial and glacial SIMS δ³⁴S_pyr values, weighted by grain sizes (top panel, red, and bottom panel, dark blue, respectively) and simulated δ³⁴S_pyr values corresponding to interglacial sedimentary conditions (second panel, light red) and glacial conditions (third panel, light blue).
[Fig. 3 panel labels: Interglacial observations; Simulation for observed interglacial sedimentary conditions; Simulation for observed glacial sedimentary conditions. Color scale: Grain area (μm²).]

Discussion

Our results demonstrate that pyrite is a sensitive recorder of the diagenetic evolution of pore water sulfide in marine sediments, as suggested by previous studies (1-5, 29, 30, 36). That the magnitude of intergrain isotopic variability in modern marine sedimentary pyrite populations can be so large (Fig. 3, A and B) and the magnitude of intragrain isotopic variability so small (fig. S2) supports the long-held notion that the uniformity of microcrystal size and morphology in a framboid reflects an initial rapid burst of nucleation, followed by a short duration of diffusion-controlled growth (38). In glacial samples, growth of framboids must have initiated over a range of depths and lasted only for short durations for most grains. In each case, growth may have been terminated by local reactant diffusion limitation, exacerbated by decreasing local permeability associated with authigenesis. In the few cases in which intragrain isotopic variability was present (e.g., fig. S3), it took the form of core-rim ³⁴S enrichments in grains with no discernible microcrystals, suggesting that some framboids were overgrown or infilled by a later generation of pyrite (30), likely associated with the local dissolution of less-reactive iron phases (33, 39).

The potential applications of this approach are numerous. In sediments formed beneath oxic water columns sampled at a high temporal resolution relative to the residence time of sulfate in the ocean, the maximum offset between δ³⁴S_pyr and contemporaneous seawater δ³⁴S_sulfate provides an estimate of maximum ε_mic, and the range indicates the extent of closed-system pore water distillation. In the rock record, if coupled with seawater δ³⁴S_sulfate data from barite, carbonate-associated sulfate, or evaporites (14, 40, 41), grain-specific δ³⁴S_pyr data could be used to reconstruct ancient ε_mic without the issues plaguing current bulk approaches (14). In both modern marine sediments and sedimentary rocks, complementary geochemical, petrographic, and sedimentological data across stratigraphic variations in δ³⁴S_pyr distributions could allow conclusive identification of the drivers of these variations. Such complementary data could include, for example, iron speciation measurements, porosity determination by petrography or tomography, and sedimentation rate constraints at the scale of an outcrop or sediment core. This approach could also be used to test previous interpretations of δ³⁴S_pyr records (e.g., about pulses of global pyrite burial or evolving seawater sulfate reservoirs), especially when made in the absence of time-equivalent seawater δ³⁴S_sulfate data (11, 27, 42, 43). The approach developed in this study enables time- and locale-specific metabolic and depositional information to be readily obtained from sedimentary samples, providing scope for reassessing interpretations of the extensive but previously inscrutable bulk δ³⁴S_pyr record.

Bryant et al., Science 382, 912-915 (2023)
24 November 2023

SULFUR CYCLE


Sedimentary parameters control the sulfur isotope 
composition of marine pyrite 


I. Halevy, D. A. Fike, V. Pasquier, R. N. Bryant, C. B. Wenk, A. V. Turchyn, D. T. Johnston, G. E. Claypool


Reconstructions of coupled carbon, oxygen, and sulfur cycles rely heavily on sedimentary pyrite sulfur isotope compositions (δ³⁴S_pyr). With a model of sediment diagenesis, paired with global datasets of sedimentary parameters, we show that the wide range of δ³⁴S_pyr (~100 per mil) in modern marine sediments arises from geographic patterns in the relative rates of diffusion, burial, and microbial reduction of sulfate. By contrast, the microbial sulfur isotope fractionation remains large and relatively uniform. Over Earth history, the effect of increasing seawater sulfate and oxygen concentrations on sulfate and sulfide transport and reaction may explain the corresponding increase observed in the δ³⁴S offset between sulfate and pyrite. More subtle variations may be related to changes in depositional environments associated with sea level fluctuations and supercontinent cycles.


Microbial sulfate reduction (MSR) accounts for about one third of the organic matter degradation in marine sediments (1) and when coupled to burial of pyrite (FeS₂) is a major indirect source of atmospheric oxygen (O₂) (2, 3). Fractionation of sulfur (S) isotopes during MSR (ε_mic) (4-8) leads to production of ³⁴S-depleted sulfide in sediment pore water, some of which reacts with iron to become preserved in pyrite. As a probe into organic matter degradation and O₂ production, the S isotope composition of pyrite in marine sedimentary rocks and its offset from that of coeval seawater sulfate (Δ_pyr = δ³⁴S_sulfate − δ³⁴S_pyr) (4) are widely used to reconstruct the coupled biogeochemical cycles of carbon, oxygen, and sulfur (2, 3, 7, 9). Modern marine sediments display bulk Δ_pyr between ~−45 and 77‰ (Fig. 1), but this range has varied over Earth history. For example, marine sedimentary rocks display Δ_pyr values mostly <~15‰ in the Archean (4), up to ~40‰ over much of the Proterozoic, and up to ~75‰ only in the Phanerozoic (Fig. 1) (10).

Variations in ε_mic are often invoked to explain variable Δ_pyr records, as MSR is the largest fractionation source in the S cycle and thought to be dependent on local environmental conditions (17). Based on relationships between sulfate concentration ([SO₄²⁻]) and ε_mic in culture experiments, the Archean-Proterozoic-Phanerozoic increase in Δ_pyr has been attributed to increasing seawater [SO₄²⁻], following progressive oxygenation of Earth's oceans and atmosphere (12). The range of Δ_pyr in Phanerozoic sedimentary rocks has been ascribed to environmental variation in the cell-specific sulfate reduction rate (csSRR) (7), using well-established experimental relationships between ε_mic and csSRR (5-8). However, natural csSRRs are usually low enough (1, 13-15) that both laboratory data and models predict ε_mic close to S isotope thermodynamic equilibrium between sulfate and sulfide, ε_eq (8, 16) (~66 to 78‰ over the range of seafloor temperatures; −1.9 to 30°C) (fig. S1A). Indeed, modern sediments exhibit evidence for near-equilibrium ε_mic (17-20), even in sulfate-poor environments (21, 22). This implies that, at least at present, MSR most commonly operates near equilibrium, weakening the csSRR- and [SO₄²⁻]-based arguments for ε_mic as the driver of Δ_pyr variations
across modern environments and over geologic time. Critical controls on sedimentary δ³⁴S_pyr appear to be missing from our understanding of the marine S cycle.

Fig. 1. Geologic and modern marine Δ_pyr. (A) Compilation of reported Δ_pyr values. (B) Histograms and medians (crosses) of Δ_pyr in different Earth history intervals. Sample numbers are in parentheses. Arch, Archean; Prot, Paleo-Mesoproterozoic; Neoprot, Neoproterozoic; Paleo, Paleozoic; Meso, Mesozoic; Ceno, Cenozoic; and modern marine sediment cores, split into shallow (water depth <1,000 m, light gray) and deep ocean (water depth >1,000 m, dark gray). The Δ_pyr values are not weighted by the sediment mass within a given depositional environment.

Table 1. Predicted and observed Δ_pyr and ε_mic. Geologic δ³⁴S offsets between seawater sulfate and sedimentary pyrite (Δ_pyr) and estimates of seawater [SO₄²⁻] (10, 39-44), modern marine Δ_pyr compiled in this study (SM), and model predicted ε_mic and Δ_pyr. The mean ± 1σ and the median and range covering 68% (in parentheses) are shown.
[Table 1 body not recoverable from this copy; surviving column header: # of draws.]

Observations and diagenetic models link sedimentary parameters, like the sedimentation rate and the availability and reactivity of iron and organic carbon, to biogeochemical activity within the sediment and the isotopic composition of pore water sulfate and sulfide (23-28). These parameters of the physical environment may come to influence microbial respiration rates and ε_mic, but importantly, they influence the relative rates of pore water sulfate burial, sulfate and sulfide diffusion, and production and consumption of these and other S-bearing compounds by (bio)chemical reactions (12, 18, 19, 23-28). For example, rapid sediment deposition quickly adds distance between a progressively buried sediment package and the sediment-water interface, limiting the time available for diffusive resupply of pore water sulfate and potentially leading to complete sulfate consumption by MSR within centimeters to decimeters of the sediment-water interface. In this "semiclosed" system with respect to sulfate, the enrichment of residual pore water sulfate (and the sulfide produced from it by MSR) in ³⁴S is pronounced. Conversely, at a low accumulation rate, distance to the sediment-water interface increases slowly, allowing more time for diffusive sulfate exchange with the water column. In this more "open" system, the pore water sulfate concentration and δ³⁴S are more closely buffered to those of ambient sulfate, and complete pore water sulfate drawdown may occur tens to hundreds of meters beneath the sediment-water interface, if at all. For illustrative purposes, neglecting nonsteady deposition and diagenetic reactions other than MSR and pyrite formation, in a completely closed sedimentary system (with respect to sulfate), sulfate is supplied exclusively by burial of seawater in the sediment pores, all sulfate is progressively consumed by MSR, and the δ³⁴S of the pooled aqueous sulfide (and pyrite produced from it) will be identical to that of the initial seawater sulfate (i.e., bulk Δ_pyr of zero). In a fully open system, the pore water sulfate concentration and δ³⁴S will be pinned to those of the overlying seawater sulfate, and the sulfate-sulfide δ³⁴S offset (and Δ_pyr) will be close to ε_mic, near 70‰. On the openness continuum, bulk Δ_pyr values between zero and ~ε_mic are, therefore, expected.
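The two end members described above can be reproduced with a textbook Rayleigh distillation calculation, offered here only as a minimal sketch and not as the coupled diagenetic-microbial model used in this study. It assumes a constant ε_mic of 70‰, an initial sulfate δ³⁴S of 21‰, and a single closed reservoir in which a fraction `f_remaining` of the sulfate is left unreacted. The function names and default values are illustrative assumptions.

```python
def pooled_sulfide_d34s(f_remaining, d34s_initial=21.0, epsilon=70.0):
    """d34S (permil) of sulfide pooled from a closed sulfate reservoir.

    f_remaining: fraction of the initial sulfate still unreacted (0 <= f < 1).
    epsilon:     assumed constant sulfate-sulfide fractionation (permil).
    """
    if not 0.0 <= f_remaining < 1.0:
        raise ValueError("f_remaining must satisfy 0 <= f < 1")
    alpha = 1.0 - epsilon / 1000.0       # instantaneous product/reactant factor
    r0 = 1.0 + d34s_initial / 1000.0     # initial ratio relative to the standard
    # Rayleigh expression for the accumulated (pooled) product.
    r_pooled = r0 * (1.0 - f_remaining ** alpha) / (1.0 - f_remaining)
    return (r_pooled - 1.0) * 1000.0


def bulk_delta_pyr(f_remaining, d34s_initial=21.0, epsilon=70.0):
    """Offset between initial sulfate and pooled sulfide (permil)."""
    return d34s_initial - pooled_sulfide_d34s(f_remaining, d34s_initial, epsilon)


# From nearly open (almost all sulfate remains) to fully closed (none remains).
fractions = [0.999, 0.9, 0.5, 0.1, 0.0]
offsets = [bulk_delta_pyr(f) for f in fractions]
for f, d in zip(fractions, offsets):
    print(f"fraction of sulfate remaining = {f:5.3f} -> bulk offset = {d:5.1f} permil")
```

When almost no sulfate has been consumed, the offset is close to the assumed ε_mic; when all sulfate is consumed, the pooled sulfide matches the initial sulfate and the offset is zero. Intermediate degrees of consumption fall in between, which is the openness continuum the text describes.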

Other factors may affect the openness of sedimentary systems. At a given sedimentation rate, high availability of reactive organic matter (or methane) results in rapid pore water sulfate drawdown by MSR and associated S isotope distillation. In concert, the sharp resulting sulfate concentration gradient increases the diffusive resupply of sulfate from the overlying seawater. In parallel, reactive iron availability and the organic matter-dependent rate of MSR affect Δ_pyr by controlling the buildup and S isotope composition of pore water sulfide, as well as the proportion of sulfide that may be sequestered in pyrite. Just as the sedimentation rate governs the openness of the sediment system with respect to sulfate, iron and organic carbon are critical for the openness with respect to sulfide (29).

The physical and chemical controls on diagenetic S isotope effects have long been appreciated (12, 18, 19, 23-28), but what remains unclear, despite being critical for understanding Earth history, is the degree to which preserved variation in Δ_pyr is driven by sedimentary parameters (and the openness of the system) versus controls on ε_mic. In a companion paper, Bryant et al. demonstrate that the distribution of grain-specific δ³⁴S_pyr values within a sample, rather than traditional bulk δ³⁴S_pyr, independently resolves both microbial and depositional controls of Δ_pyr (20). Here, we more generally explore the role of ε_mic variation and the dynamics of reaction and transport, as related to depositional parameters, in controlling Δ_pyr in modern marine and ancient sediments. To this end, we developed a coupled model of microbial fractionation and sediment diagenesis. The diagenetic model accounts for the reaction and transport (including ³²S and ³⁴S) of aqueous sulfate, sulfide, and methane, and of solid organic matter, ferric iron oxides, zero-valent S, organic S compounds, iron monosulfide (FeS), and pyrite (full description in supplementary materials). Aspects of this model are similar to previous models of sedimentary S diagenesis (e.g., 23-28), but differ in the completeness of sedimentary S cycling processes and, importantly, in the treatment of ε_mic. That is, a metabolism-informed model (16) embedded in the diagenetic column uses csSRR and local concentrations of sulfate and sulfide to calculate (rather than prescribe) ε_mic. The resulting profiles (concentrations and δ³⁴S values) of pore water sulfate, sulfide, and accumulating pyrite allow determination of Δ_pyr values that account for the present understanding of the controls on ε_mic and for reaction and transport of the relevant S compounds. We validated the model against available profiles of pore water sulfate concentrations and δ³⁴S values and tested its sensitivity to the parameters (SM, model validation and sensitivity analysis).

Fig. 2. Δ_pyr and ε_mic in the modern shallow and deep ocean. Model Δ_pyr versus ε_mic in the (A) shallow ocean (water depth <1,000 m) and (C) deep ocean (water depth >1,000 m). The horizontal and vertical lines show the medians of the Δ_pyr and ε_mic distributions. Also shown are histograms and medians (crosses) of modeled and observed Δ_pyr in the (B) shallow and (D) deep ocean and (E) model ε_mic for shallow and deep settings. Model histograms in the shallow and deep ocean are orange and purple, respectively. Observed Δ_pyr histograms in the shallow and deep ocean are light and dark gray, respectively.


Sedimentary parameters shape the δ³⁴S
of modern marine pyrite


To constrain the controls on the modern range and spatial distribution of Δ_pyr, we compiled global gridded datasets (1° × 1°) (SM, gridded datasets) of the parameters required for the diagenetic model (e.g., sedimentation rate, organic carbon loading, porosity). We randomly sampled ~10,000 locations (out of ~40,000) and solved the coupled diagenetic-microbial model with the combinations of parameters at those locations. On the continental shelf and slope (depth <1000 m) and on the continental rise and the abyssal plain (depth >1000 m), predicted ε_mic values are 60 ± 12‰ and 65 ± 9‰ (mean ± 1σ), respectively (Table 1), within 10 ± 9‰ and 16 ± 11‰ of ε_eq at the local sedimentary temperature (fig. S1). By contrast, the preserved Δ_pyr is lower, 38 ± 11‰ and 52 ± 11‰ in the shallow and deep ocean, respectively, and in good agreement with the distributions of observed Δ_pyr in modern marine sediments (Table 1 and Fig. 2).
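The sampling-and-summary workflow described above (draw grid locations at random, run a model at each, and report mean ± 1σ by depth class) could be organized roughly as follows. This is a sketch under stated assumptions: the grid, the `toy_model` function, and all numerical values are hypothetical stand-ins chosen only to make the example run, and they do not represent the study's datasets, model, or results.

```python
import random
import statistics


def sample_and_summarize(grid_cells, model, n_draws, depth_cutoff_m=1000.0, seed=0):
    """Draw grid cells without replacement, run `model` on each, and
    return {depth class: (mean, sample std, count)}."""
    rng = random.Random(seed)
    draws = rng.sample(grid_cells, n_draws)  # sampling without replacement
    groups = {"shallow": [], "deep": []}
    for cell in draws:
        key = "shallow" if cell["depth_m"] < depth_cutoff_m else "deep"
        groups[key].append(model(cell))
    return {
        key: (statistics.mean(vals), statistics.stdev(vals), len(vals))
        for key, vals in groups.items()
        if len(vals) > 1
    }


# Hypothetical grid of ~40,000 cells, each carrying only a water depth.
_rng = random.Random(42)
toy_grid = [{"depth_m": _rng.uniform(0.0, 6000.0)} for _ in range(40_000)]


def toy_model(cell):
    """Placeholder for the diagenetic-microbial model; the output is arbitrary."""
    return 30.0 + 0.004 * cell["depth_m"]


summary = sample_and_summarize(toy_grid, toy_model, n_draws=10_000)
for key, (mean, std, n) in summary.items():
    print(f"{key}: {mean:.1f} +/- {std:.1f} (n = {n})")
```

Passing the model in as a function keeps the sampling and summary logic independent of whatever calculation is run at each location.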

The large range of predicted and observed Δ_pyr, despite relatively uniform, near-equilibrium ε_mic, highlights a role for sedimentary parameters in governing modern marine δ³⁴S_pyr. Organic carbon and iron mass fractions, organic matter reactivity, and porosity all exert direct secondary controls on model Δ_pyr, but the main determinant is the net sediment accumulation rate (SM, figs. S2 to S7), which affects system openness in several ways. First, as described above, the efficiency of diffusive resupply of pore water sulfate decreases with increasing net accumulation rate, leading to smaller bulk Δ_pyr. Second, the delivery rate of organic matter to the seabed is often positively correlated with sedimentation rate, except at extremely high rates (30). Rapid delivery of reactive organic matter leads to higher overall (bulk, not cell-specific) rates of MSR and more rapid pore water sulfate consumption and isotopic distillation. Third, higher rates of MSR lead to accumulation of pore water sulfide, mixing sulfide with different δ³⁴S produced at different depths in the sediment and leading to isotopic homogenization of the sulfide and smaller bulk Δ_pyr.

The factors that promote a small bulk Δ_pyr value operate in tandem in the shallow ocean. More rapid sedimentation, including that of organic matter and reactive iron, occurs on the continental shelf and slope (30). The porosity at the sediment-water interface displays a scattered inverse correlation with sedimentation rate (31), with lower porosity further hindering diffusive sulfate resupply. Thus, rapidly depositing, organic-rich, iron-rich, low-porosity sediments tend to preserve smaller Δ_pyr on the continental shelf and slope, whereas more slowly depositing, organic carbon- and iron-lean, porous sediments tend to preserve larger Δ_pyr in the deep ocean (Table 1 and Fig. 2). The spatial distribution of seafloor temperatures (high and low temperature in the shallow and deep ocean, respectively) accentuates this pattern, due to the temperature dependence of ε_eq, but this is a secondary control.

The distributions of predicted and observed Δ_pyr appear similar, with a slight tail toward lower observed Δ_pyr at both water depth ranges and an overall narrower predicted distribution in shallow depositional environments than the observed distribution (Fig. 2, B and D). This mismatch likely arises from episodic or discontinuous depositional processes and/or small-scale spatial heterogeneity, which are expected to be more prevalent in shallow depositional environments and are absent from the diagenetic model. Episodic sedimentation (32), sediment remobilization, and accompanying changes in the sediment oxidation state and organic matter/iron availability are expected to lead to preservation of smaller Δ_pyr (10, 33). Unresolved small-scale spatial heterogeneity in the availability of surface-derived organic matter and methane produced within the sediment may additionally influence the degree of local consumption and ³⁴S enrichment of pore water sulfate and sulfide. Nevertheless, with only modest differences between the predicted and observed Δ_pyr distributions and given the large range of Δ_pyr captured in the geologic record, we argue that the model is well equipped to understand the factors that control Δ_pyr preserved in modern depositional environments and over Earth history.


The sedimentary rock record is biased toward 
shallow environments 


The ability to sample marine sediments from a diversity of depositional environments permits observation of the modern range of Δ_pyr (~100‰; Fig. 2, B and D). The geologic record of marine sedimentary pyrite, however, seldom includes sediments from this full range of depositional environments. Prior to the late Mesozoic, records consist instead predominantly of sediments deposited on the continental paleo-shelf or slope, in water no deeper than a few hundred meters (34). The similarity between predicted and observed distributions of Δ_pyr in modern marine shelf and slope sediments offers model predictions as a means to explore the environmental controls on δ³⁴S_pyr in these environments. The predictions are affected by the present-day continental configuration and sea level, which govern the modern distributions of depositional parameters. Still, if these distributions and the range of paleo-depths captured by the sedimentary record have not changed systematically since the emergence of continental shelves (an assumption that we relax later), then the controls on production and preservation of geochemical signals in shelf and slope sediments may be explored.

Fig. 3. The dependence of model Δ_pyr and ε_mic in the shallow ocean on seawater [SO₄²⁻] and [O₂]. Predicted (A) ε_mic and (B) Δ_pyr against [SO₄²⁻], for 100% (opaque, left) and 1% (translucent, right) of present bottom-water [O₂]. Medians are shown as short horizontal lines, boxes cover 25th to 75th data percentiles, and thin vertical lines cover 95% of the data.

From the model results, we learn that the arithmetic mean (or median) of all geologic δ³⁴S_pyr measurements is a flawed measure of the true global average Δ_pyr at any given time, and that this is due to two factors. First, the strong preservation bias in favor of shallow marine environments limits contributions from deep-water depositional settings to global geologic Δ_pyr estimates. Continental shelf and slope settings display more closed-system behavior with respect to sulfate and sulfide and smaller Δ_pyr (e.g., mean Δ_pyr of 52 and 38‰, respectively, in the model deep and shallow ocean; Table 1). The geologic Δ_pyr, which represents almost exclusively shallow environments, should thus yield an underestimate of the aggregate whole-ocean Δ_pyr. Fortunately, the inaccuracy introduced by this preservation bias is limited, as our model shows that ~80% of all pyrite forms on the continental shelf and slope. Thus, in the modern model ocean, the shelf-constrained Δ_pyr of 38‰ is close to the integrated whole-ocean Δ_pyr of 41‰. A second and more substantial source of inaccuracy arises from the difference between the mean (or median) of a population of individual δ³⁴S_pyr values and the formation rate-weighted average δ³⁴S_pyr. For example, mean model shallow δ³⁴S_pyr is −17‰ (Δ_pyr = 38‰), whereas the pyrite formation rate-weighted δ³⁴S_pyr is −4‰ (Δ_pyr = 25‰). This effect, too, arises from the disproportionate contribution to global pyrite formation of rapidly depositing, organic carbon- and iron-rich marine sediments, which display more closed-system behavior and relatively small Δ_pyr. Thus, reconstructions of S-cycle fluxes using geologic Δ_pyr would tend to underestimate the fractional pyrite burial flux, f_pyr, the measure used to extrapolate the oxygenation of Earth's surface. For example, using the modern mean Δ_pyr of 38‰, ~27% of the S leaves the ocean as pyrite, whereas with the rate-weighted Δ_pyr of 25‰, pyrite burial is ~35% of the S outflux. Though model and data uncertainties render the degree of f_pyr underestimation uncertain, our results suggest that the effect may be substantial. We expect that other global parameters (outside of the S cycle) constrained using the shallow-water sedimentary rock record are similarly biased, impacting reconstructions of past seawater chemistry, climate, and biological activity.


Drivers of variation in pyrite δ³⁴S
through time


Local depositional and sedimentary controls on preserved δ³⁴S_pyr limit the utility of bulk sedimentary δ³⁴S_pyr measurements for reconstruction of the global S cycle, a popular avenue of past and present research. Temporal correlation of stratigraphic variations in δ³⁴S_pyr among several locations may indicate a global phenomenon (10, 17, 33, 35), though the driver may be unrelated to the S cycle. For example, glacial-interglacial variation in bulk δ³⁴S_pyr appears to have been driven by changes in sedimentation rate in response to global sea level change (17, 20, 36), and this variation was neither caused by nor did it cause changes in the global S cycle. Nevertheless, with robust independent constraints on depositional conditions, local δ³⁴S_pyr data may be used to constrain global S-cycle parameters, such as seawater [SO₄²⁻], a major component of seawater, the largest oxidant pool in the Phanerozoic ocean, and a key indicator of Earth's surface oxidation state. Conversely, given independent constraints on the concentration and δ³⁴S of seawater sulfate, stratigraphic variation in δ³⁴S_pyr may be used to constrain changes in the depositional environment, informing basin and sea level reconstructions. Notably, compilations of δ³⁴S_pyr data can still be used to describe Δ_pyr distributions, which can be tied back to global S-cycle parameters, given knowledge of the feasible range (rather than local values) of sedimentary parameters.

We find that one of the main influences on Δ_pyr is seawater [SO₄²⁻]. Lower [SO₄²⁻] in early Cretaceous seawater has been suggested to have decreased global pyrite burial rates, thereby explaining negative excursions in seawater sulfate δ³⁴S (and presumably lower δ³⁴S_pyr with no effect on ε_mic) (27, 28). Times of low Phanerozoic seawater [SO₄²⁻] have also been suggested to decrease ε_mic, resulting in correspondingly smaller Δ_pyr (30, 34). We describe a different effect, where over the range of Phanerozoic seawater [SO₄²⁻] (~3 to 30 mM) (30-32), ε_mic remains relatively large whereas Δ_pyr decreases markedly with decreasing [SO₄²⁻] (Table 1 and Fig. 3). The dependence of Δ_pyr on [SO₄²⁻] is explained by the effects of seawater [SO₄²⁻] on the openness of the sediment system with respect to both sulfate and sulfide. For given depositional parameters at low seawater [SO₄²⁻], diffusive buffering of pore water sulfate concentrations and δ³⁴S to those of seawater sulfate is less effective. Consequently, pyrite formed in sediments underlying a sulfate-poor water column will have more distilled δ³⁴S (i.e., closer to that of the sulfate source) and smaller Δ_pyr values, an inference that is supported by observations and models of water columns with variable [SO₄²⁻] (e.g., 18, 19, 37). Moreover, for a given availability of reactive iron at lower seawater [SO₄²⁻], more of the sulfide produced by MSR is likely to be sequestered in pyrite, also resulting in smaller integrated Δ_pyr.

As [SO₄²⁻] decreases, the range of parameters that come to influence Δ_pyr increases. For example, at [SO₄²⁻] < 3 mM we find both a decrease in ε_mic to ~30 to 40‰ (with a wide range between ~5 and ~65‰, depending on the csSRR) and near-zero Δ_pyr (Table 1 and Fig. 3). Such a combined effect of low [SO₄²⁻] on ε_mic and the degree of pore water sulfate isotopic distillation has been previously suggested to explain the overall increase in Δ_pyr over Earth history (12). We explicitly account for the dependence of ε_mic on pore water sulfate and sulfide concentrations, and on the csSRR, explaining the decrease in ε_mic by two mechanisms working in concert. The first mechanism, as suggested by culture experiments and models (12, 15), is a direct effect of low [SO₄²⁻] on ε_mic (a "reservoir effect"). The second mechanism is related to the pooling of pore water sulfide. At low seawater [SO₄²⁻], sulfide production is slower and pyritization more complete, both leading to lower pore water sulfide concentrations. This thermodynamic lever on (farther-from-equilibrium) MSR leads to a correspondingly smaller ε_mic (15). These two mechanisms for decreased ε_mic at low seawater [SO₄²⁻], together with more closed-system behavior with respect to sulfate (i.e., greater isotopic distillation), lead to the near-zero Δ_pyr.

The concentration of O₂ in bottom water exerts an additional control on Δ_pyr
through the depth in the sediment where MSR (and more broadly S cycling)
commences, and this control may help to explain Δ_pyr through time. We
illustrate the role of O₂ in model calculations with 1% present-day
bottom-water [O₂], chosen arbitrarily to represent most of Earth history, when
the ocean interior is thought to have contained little or no O₂ (e.g., 33-35).
With lower bottom-water [O₂], the preserved Δ_pyr is larger (Fig. 3) as a
result of enhanced pore water sulfide accumulation and a shallower chemocline
(oxic-anoxic interface) within the sediment. Pore water sulfide accumulation
is due to a smaller oxidative sulfide sink at and above the chemocline. Pore
water sulfide concentrations are high enough to result in
closer-to-equilibrium MSR and larger associated ε_mic (Fig. 3), but not so
high that isotopic homogenization of the pooled sulfide mutes Δ_pyr. The
increase in ε_mic overshadows an opposite effect in which a shallower onset of
anaerobic respiration within the sediments (i.e., a shallower chemocline)
provides sulfate reducers with more abundant and more reactive organic matter,
potentially leading to higher csSRR. Were it not for the higher pore water
sulfide levels, the higher csSRR would lead to greater departure of MSR from
equilibrium and smaller ε_mic. Separately from pore water sulfide
accumulation, a shallower chemocline at lower bottom-water [O₂] leads to more
effective diffusive buffering of pore water sulfate concentrations and δ³⁴S
values to those of the overlying seawater. Together, the effects of sulfide
accumulation on ε_mic and the effects of a shallower chemocline on sulfate
isotopic buffering produce larger Δ_pyr when bottom-water [O₂] is low (Fig. 3).

Without appropriate constraints on the parameters that influence the preserved
range and average of Δ_pyr through time, we refrain from inverting this
framework to constrain past seawater [SO₄²⁻]. Nevertheless, these insights may
be used to make several inferences about the drivers of long-term changes in
the average and spread of Δ_pyr preserved in marine sedimentary rocks. First,
by deconvolving the coupled effects of lower seawater [SO₄²⁻] and lower
bottom-water [O₂], we highlight the balance of sulfate drawdown and resupply
in addition to pore water sulfide pooling as key causes for low Archean and
Paleo-Mesoproterozoic Δ_pyr values (Figs. 1 and 3). We suggest a more minor
but non-negligible role for smaller ε_mic when seawater [SO₄²⁻] < 3 mM.
Variations in Phanerozoic seawater [SO₄²⁻] may have also affected the range of
preserved Δ_pyr values, but almost exclusively through effects on the openness
of the sediment system with respect to both sulfate and sulfide rather than
through effects on ε_mic.

Halevy et al., Science 382, 946-951 (2023)    24 November 2023

Secondly, our results show only modest sensitivity to the concentration and
reactivity of organic matter (figs. S2 to S4). Thus, variations in and
uncertainty about global rates of primary productivity through time, the
amount and quality of organic matter accessible to sulfate reducers in
sedimentary environments, and the O₂ dependencies of these and other
parameters are not expected to translate into similar uncertainty in ε_mic and
Δ_pyr values. Similarly, we find modest sensitivity to the concentrations of
reactive iron (fig. S5). At reactive iron concentrations ≥1 wt %, even with
the modern, high seawater [SO₄²⁻], there appears to be enough iron to
sequester, in pyrite, most of the sulfide produced by MSR. Uncertainty in
geologic Δ_pyr values is thus expected to be minor, given the purported
greater availability of iron in Earth’s ancient oceans.

Relaxing the assumption of no change in depositional environments through
time, our analysis makes testable predictions for times in Earth’s history
during which such changes are independently documented. For example, changes
in sedimentation rate driven by sea level variations during
glacial-interglacial cycles substantially changed the average and distribution
of preserved Δ_pyr (16, 19, 42). Similarly, the Cenozoic decrease in global
sea level and associated loss of shallow-water depositional environments is
suggested to have caused an increase in Δ_pyr, which led to an increase in
seawater sulfate δ³⁴S of ~4‰ between 52 and 48 million years ago (38). We
predict similar changes in Δ_pyr in association with other large changes in
global or regional relative sea level, and possibly with changes in global
sedimentation rates related to supercontinent cycles.

A comparison of compiled observations and the results of a coupled
diagenetic-microbial model identifies depositional parameters, especially
sedimentation rate and its interplay with seawater [SO₄²⁻] and [O₂], as key
determinants of marine δ³⁴S_pyr. More so than seawater temperature or pH, or
variations in microbial respiration rates, it is the physical parameters of
depositional environments that best explain the main features of the
sedimentary pyrite record, both in the modern ocean and over Earth history.
These insights invite reevaluation of existing δ³⁴S_pyr records, with new
opportunities emerging to constrain depositional paleo-environments and the
evolution of seawater chemistry.
MINERALOGY


Dissolution enables dolomite crystal growth near ambient conditions


Joonsoo Kim¹, Yuki Kimura², Brian Puchala¹, Tomoya Yamazaki², Udo Becker³, Wenhao Sun¹*


Crystals grow in supersaturated solutions. A mysterious counterexample is dolomite CaMg(CO3)2, a 
geologically abundant sedimentary mineral that does not readily grow at ambient conditions, not even 
under highly supersaturated solutions. Using atomistic simulations, we show that dolomite initially 
precipitates a cation-disordered surface, where high surface strains inhibit further crystal growth. 
However, mild undersaturation will preferentially dissolve these disordered regions, enabling 
increased order upon reprecipitation. Our simulations predict that frequent cycling of a solution between 
supersaturation and undersaturation can accelerate dolomite growth by up to seven orders of magnitude. 
We validated our theory with in situ liquid cell transmission electron microscopy, directly observing bulk 
dolomite growth after pulses of dissolution. This mechanism explains why modern dolomite is primarily found 
in natural environments with pH or salinity fluctuations. More generally, it reveals that the growth and 
ripening of defect-free crystals can be facilitated by deliberate periods of mild dissolution. 


Dolomite CaMg(CO3)2 is a thermodynamically stable carbonate mineral that
composes about 30% of the sedimentary carbonate mineralogy in Earth’s crust
(1-3). Despite its geochemical abundance, nearly two centuries of sustained
scientific efforts have failed to precipitate dolomite in the laboratory near
ambient conditions (4). One remarkable long-term experiment indicated a
failure to precipitate dolomite at 25°C from dilute solution despite 1000-fold
oversaturation after 32 years (5). The apparent contradiction between the
massive deposits of dolomite in nature and its inability to grow from
supersaturated solutions near ambient conditions is a long-standing mystery
known as the “dolomite problem” (3, 6-9).

Prominent theories for low-temperature dolomite formation rely on extrinsic
influences such as microbes (10), sulfate reduction (11), or exopolymeric
substances (12). However, these explanations do not resolve the fundamental
question of why a thermodynamically stable (13) inorganic carbonate mineral
cannot directly precipitate abiotically from supersaturated solutions.
Dolomite can be grown in the laboratory by using high-temperature hydrothermal
setups (14); however, natural dolomite deposits are often formed in
sedimentary environments that have remained below 60°C (15). One popular
abiotic explanation for dolomite growth inhibition is that the aqueous Mg²⁺
ion has a strong hydration shell, which would lead to a large kinetic barrier
for dehydration (16). However, other Mg-containing rhombohedral dolomite
analogs, such as norsethite BaMg(CO3)2 and PbMg(CO3)2, readily precipitate in
water (17, 18), suggesting that Mg²⁺ dehydration is not a kinetically
prohibitive process. Moreover, the replacement of water with an organic
solvent with a weak Mg²⁺ solvation shell still does not promote ordered
dolomite precipitation (19, 20). Retrospective analyses of grain sizes and
ordering in dolomite minerals indicate that fluid-mediated ripening or
recrystallization may be needed to explain the origin of dolomite sediments
(21-23), although the mechanisms that drive these processes remain unclear.


¹Department of Materials Science and Engineering, University of Michigan, Ann
Arbor, MI, USA. ²Institute of Low Temperature Science, Hokkaido University,
Sapporo, Japan. ³Department of Earth and Environmental Sciences, University of
Michigan, Ann Arbor, MI, USA.

*Corresponding author. Email: whsun@umich.edu


Kim et al., Science 382, 915-920 (2023)    24 November 2023


Nucleation inhibition versus growth inhibition 


Failure to precipitate dolomite can result from either nucleation inhibition
or growth inhibition. We first tested a nucleation inhibition hypothesis using
density functional theory (DFT) to compare the classical nucleation barrier of
dolomite against other carbonate minerals. We calculated the hydrated surface
energy of dolomite to be γ_dolomite = 0.21 J/m² (fig. S1) (24), which is
similar in magnitude to that of calcite, γ_calcite = 0.19 J/m², and aragonite,
γ_aragonite = 0.28 J/m² (25). Aragonite and calcite readily nucleate in
average seawater supersaturations of σ_aragonite = 3.3 and σ_calcite = 3.6. By
contrast, dolomite will always be less soluble than calcite and aragonite if
the solution Mg/Ca > 0.74, such as in modern seawater (Mg/Ca = 5.2), for which
the average dolomite supersaturation is σ_dolomite = 4.6 (table S1). On the
basis of these surface energies and supersaturation values, the dolomite
nucleation barrier in marine fluids should be smaller than that of calcite and
aragonite, which suggests that dolomite precipitation is not nucleation
limited. Indeed, ordered dolomite does not even grow on freshly etched
pristine dolomite (26), which would have offered a seed to overcome any
nucleation barrier.
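
The comparison above can be reproduced roughly with the classical nucleation theory scaling, in which the barrier for a spherical nucleus is proportional to γ³/σ². The following is a minimal sketch under assumptions that are not stated in the article: the molecular volume, temperature, and shape factor are taken as identical for all three minerals, so only the relative ordering is meaningful. The function name `relative_barrier` is hypothetical.

```python
# Surface energies (J/m^2) and seawater supersaturations quoted in the text.
MINERALS = {
    "dolomite": {"gamma": 0.21, "sigma": 4.6},
    "calcite": {"gamma": 0.19, "sigma": 3.6},
    "aragonite": {"gamma": 0.28, "sigma": 3.3},
}


def relative_barrier(gamma, sigma):
    """Classical nucleation barrier up to a common prefactor: gamma^3 / sigma^2.

    Assumption: the prefactor (molecular volume, k_B*T, nucleus shape) is the
    same for every mineral, so only ratios between minerals are meaningful.
    """
    return gamma ** 3 / sigma ** 2


barriers = {name: relative_barrier(**props) for name, props in MINERALS.items()}
ranking = sorted(barriers, key=barriers.get)  # smallest barrier first
```

Under these simplifying assumptions dolomite has the smallest relative barrier of the three, which agrees with the conclusion that dolomite precipitation is not nucleation limited.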

We therefore anticipate that the inability to precipitate dolomite arises from
growth inhibition. The atomic structure of the dolomite step edge provides
crucial clues to understanding dolomite growth phenomenology. Like calcite
CaCO3, dolomite is a rhombohedral carbonate, except dolomite exhibits
alternating Ca²⁺ and Mg²⁺ layers perpendicular to the [0001] direction
(Fig. 1). The primary growth surface of rhombohedral carbonates is the
close-packed (101̄4) surface, and the primary step edges for growth and
dissolution are along the symmetry-equivalent [481̄] and [4̄41] directions
(26-29).

In ordered dolomite, Ca²⁺ and Mg²⁺ alternate along these [481̄] and [4̄41] step
edges (Fig. 1B), meaning that Ca²⁺ and Mg²⁺ ions need to deposit from solution
onto the growing step edge with perfect alternating order. Such ordered
ion-by-ion attachment represents a highly improbable zero-entropy process. If
the entropy of disordering overcomes the enthalpy of ordering, then the
initially deposited step edge would exhibit Ca/Mg disorder. Experiments show
that the initially precipitated dolomite structure does have Ca/Mg disorder
(26, 30), and atomic force microscopy (AFM) experiments have measured a
Ca-rich surface with roughly Ca1.7Mg0.3(CO3)2 stoichiometry (31). Moreover,
AFM measurements show that below 120°C, dolomite growth on the (101̄4) surface
becomes self-limiting after one to five atomic layers (32). To rationalize the
low-temperature growth inhibition of dolomite, we calculated from DFT that the
formation energy of dolomite layers on a disordered dolomite substrate is
positive and unfavorable by as much as 15 kJ/mol carbonate (figs. S2 and S3)
(24), which arises because of a high strain energy from atomic mismatch
between the growth layer and the underlying disordered substrate.

At equilibrium, the dissolution and growth rate of a crystal should be equal.
However, if dolomite is initially precipitated in a disordered protodolomite
form, then this disordered structure will fundamentally be in a nonequilibrium
state. On this disordered surface, locally more-ordered regions are
correspondingly more stable and therefore have lower solubility. Disordered
regions will therefore dissolve faster than ordered regions, whereas ordered
regions of the surface, once formed, will dissolve more slowly. Over time,
dissolution-reprecipitation processes (33) should gradually evolve a
disordered dolomite surface toward an ordered surface, upon which subsequent
dolomite layers can grow (fig. S4). However, dissolution processes are
kinetically rate-limited in supersaturated solutions. If the solution is
instead cycled between supersaturation and undersaturation, then both
dissolution and reprecipitation processes can be activated iteratively.

We conducted atomistic simulations of dolomite dissolution-reprecipitation on
an ordered dolomite (101̄4) surface substrate (figs. S5 to S15) (24). To
simulate dolomite growth with quantum-mechanically derived energies across
geological timescales, we trained a cluster expansion energy model with DFT
calculations of disordered dolomite surface layers using an implicit surface
solvation model. We used this accurate and computationally efficient energy
model to run kinetic Monte Carlo (KMC) simulations of step-edge growth on
dolomite surface unit cells of 6.23 by 6.23 nm. Each atomistic dissolution or
reprecipitation step was sampled probabilistically with Monte Carlo, and the
corresponding real-time increment of each step was proportional to
t ∼ exp(ΔE_a/k_B T), where ΔE_a is the transition state activation energy
barrier of each microkinetic reaction step, k_B is the Boltzmann constant, and
T is temperature. By referencing the KMC step-edge growth rate to experimental
dolomite step-edge growth velocities (table S2), we could estimate the
corresponding real-time progression of our KMC simulations. This approach
represents the highest level of ab initio theory feasible for simulating
crystal growth at nanometer length scales and geologically relevant
timescales.
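
The exponential relation between the activation barrier and the time increment can be sketched in a few lines. This is an illustration of the general Arrhenius scaling only, not the authors' KMC code: the attempt frequency of 10¹³ Hz is a placeholder assumption (the article instead calibrates real time against experimental step-edge velocities), and the function name and barrier values are hypothetical.

```python
import math
import random

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def step_time(barrier_ev, temperature_k, attempt_frequency_hz=1e13, rng=None):
    """Time increment (s) for one thermally activated event.

    The mean waiting time scales as exp(barrier / (k_B * T)). The prefactor
    (attempt_frequency_hz) is a hypothetical placeholder, not a fitted value.
    If rng is given, a stochastic waiting time is drawn from the exponential
    distribution with that mean, as in standard residence-time KMC.
    """
    rate = attempt_frequency_hz * math.exp(-barrier_ev / (K_B_EV_PER_K * temperature_k))
    if rng is None:
        return 1.0 / rate
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return -math.log(u) / rate


T_ROOM = 298.15  # 25 degrees C in kelvin
fast = step_time(0.5, T_ROOM)   # lower barrier, shorter wait
slow = step_time(1.0, T_ROOM)   # higher barrier, exponentially longer wait
sampled = step_time(0.5, T_ROOM, rng=random.Random(0))
```

Because the waiting time grows exponentially with the barrier, a modest increase in the dissolution barrier as the surface orders can stretch individual steps by many orders of magnitude.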


Dolomite growth at constant supersaturation 


We first simulated dolomite step-edge growth at constant supersaturation and
25°C, with ion concentrations of [Ca²⁺], [Mg²⁺], [CO₃²⁻] and pH representative
of modern seawater (24, 34, 35). We plotted the surface formation energy of
dolomite at corresponding Ca/Mg compositions and cation ordering (Fig. 2A),
where we define ordering as +1 for perfect order, -1 for perfect antiorder
(all cations on the wrong site), and 0 for complete disorder. For various
possible surface orderings, we show the lowest-energy state at a given Ca/Mg
composition and cation ordering (Fig. 2A). The orange trace in Fig. 2A
indicates the time evolution of surface ordering through
dissolution-precipitation as calculated from our KMC simulations. We show the
KMC simulation in movie S1, with various snapshots from the simulations
visualized in Fig. 2B.
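
One simple way to realize an ordering scale that runs from +1 to -1 is to count cations on correct and incorrect sites. The sketch below is an assumed definition chosen to match the three limiting cases described above; the article does not give its exact formula, and the function name and the eight-site example row are hypothetical.

```python
def cation_order(occupancy, ideal):
    """Order parameter in [-1, +1]: (correct sites - wrong sites) / total sites.

    occupancy: cation actually found on each site, e.g. "Ca" or "Mg".
    ideal:     cation expected on each site in perfectly ordered dolomite.
    This is an assumed definition consistent with +1 (perfect order),
    -1 (every cation on the wrong site), and 0 (half right, half wrong).
    """
    if len(occupancy) != len(ideal) or not ideal:
        raise ValueError("occupancy and ideal must be non-empty and equal in length")
    correct = sum(found == expected for found, expected in zip(occupancy, ideal))
    return (2 * correct - len(ideal)) / len(ideal)


ideal_row = ["Ca", "Mg"] * 4                 # alternating cations along a step edge
anti_row = ["Mg", "Ca"] * 4                  # every cation on the wrong site
half_row = ["Ca", "Mg"] * 2 + ["Mg", "Ca"] * 2  # half of the sites swapped
```

With this definition, a random Ca/Mg arrangement at the ideal 1:1 composition averages to an order parameter near zero.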

In our simulations, dolomite step-edge growth initially results in a
cation-disordered surface that is consistent with previous experiments (26),
with an initial composition of Ca1.5Mg0.5(CO3)2, which is similar to previous
low-temperature AFM experiments (31). By computing the canonical ensemble and
Helmholtz free energy of the disordered surface, we confirmed that the entropy
of cation disordering exceeds the enthalpy of ordering in the early stages of
dolomite growth (fig. S16) (24).

Over the course of 11,000 KMC steps, dissolution- 
reprecipitation processes evolve the high-energy 
surface to the stable ordered dolomite con- 
figuration. This ordering process proceeds in 
log time; 50% ordering occurs in ~10° years, 
75% ordering occurs in ~10° years, and 100% 
ordering takes ~10⁷ years. In the early stages 
of step-edge growth, each dissolution step takes 
~10° s, but as the surface orders and becomes 
more stable, dissolution becomes increasing- 
ly unfavorable—requiring ~10’ seconds when 
the surface is 75% ordered. Because dissolu- 
tion is needed to eliminate disordered por- 
tions of the surface, our simulations show that 
dissolution is the rate-limiting step under con- 
stant supersaturation. This explains why a 
32-year experiment failed to precipitate dolo- 
mite despite 1000-fold oversaturation (5): The 
high constant supersaturation inhibited the 
essential dissolution process needed to elimi- 
nate disordered regions. 


Dolomite growth under fluctuating supersaturation


In general, laboratory investigations of dolomite growth have been conducted
under constant supersaturation (4, 36). However, in natural environments, the
supersaturation of fluids around dolomite can be dynamic and fluctuating
(37-42). Although modern dolomite deposits are rare, they tend to be found in
small abundances in coastal and evaporative environments (table S4) that
experience cycles of meteoric precipitation (rain, snow, or hail) followed by
evaporation (37, 40, 42, 43). Excess freshwater would lead to dolomite
undersaturation, dissolving the more-soluble, disordered surface regions
(Fig. 3, A and B). Subsequent evaporation would then result in dolomite
supersaturation, promoting dolomite reprecipitation in the recently vacated
surface sites, resulting in a surface with slightly higher overall ordering.
In addition to salinity fluctuations, carbonate solubility also varies with
temperature, pH, and other factors that can exhibit daily to seasonal
fluctuations in biogeochemical environments (36, 44, 45). For example, organic
matter oxidation in marine pore waters can form CO₂ and carbonic acid, where
transient low pH could also promote dolomite dissolution (43).

Fig. 1. Dolomite crystal structure and growth surface. (A) The orientation of
the (101̄4) growth surface with respect to the conventional unit cell. (B)
[4̄41] step edge on the (101̄4) growth surface of the ordered dolomite crystal.
(Inset) Top view of the (101̄4) surface unit cell. In ordered dolomite, Ca²⁺
and Mg²⁺ demonstrate alternating order along this [4̄41] step edge.

We next repeated our KMC simulations with the solution saturation cycling
between supersaturated (σ = 4.6) and different levels of undersaturation
(fig. S17) (24). Our simulations show that the dolomite ordering process
reduces from 10⁷ years under constant supersaturation to 10° years when the
undersaturation level is σ = -1.5, 10* years when σ = -3.4, and 10° years when
σ = -9.2 (Fig. 3C and fig. S17) (24). As a reference, the undersaturation of
dolomite in rainwater is σ = -14.16 (46, 47). During these dissolution cycles,
our simulations showed a gradual replacement of cations on incorrect sites
with the correct cations (figs. S18 to S22) (24).

Dolomite ordering is accelerated because the rate-limiting dissolution step
decreases from 10’ s under constant supersaturation to <107 s in
undersaturated solutions. On the basis of these simulations, we
semiquantitatively assert that supersaturation fluctuations can accelerate
dolomite ordering by up to seven orders of magnitude. We further performed KMC
simulations that consider multiple dissolution steps per supersaturation cycle
(Fig. 3E), revealing that Mg²⁺ ions incorrectly located on Ca²⁺ sites can be
easily dissolved, but properly situated Mg²⁺ ions tend to persist and
constrain the extent of dolomite dissolution, as indicated with the gray
regions in Fig. 3E (fig. S23). A video of the KMC simulation under fluctuating
supersaturations is provided as movie S2.


Direct TEM observation of dolomite growth under supersaturation fluctuations


To validate the importance of dissolution in dolomite growth, we designed an
in situ liquid cell transmission electron microscopy (TEM) experiment in which
the electron beam is used both to trigger dolomite dissolution and to
characterize the resultant crystal growth. In addition to a low ionic
salinity, dolomite undersaturation can also be thermodynamically driven by low
pH. We used a TEM beam dose rate of ~17 electrons/nm²/s to drive radiolysis of
water molecules (48, 49), which lowers the pH enough to dissolve dolomite
(24). When the beam turns off, the solution can reequilibrate in milliseconds
to a supersaturated state (49). We placed ~3-μm dolomite mineral crystals
(Fig. 4A and fig. S24) in a flowing supersaturated solution (σ = 3.1) at 80°C;
we then cycled between dissolution for 10 ms with the electron beam on and 2 s
with the electron beam off. This cycle was repeated 3840 times over 128 min.

We directly confirmed dolomite growth by comparing the contrast of the
dolomite image before and after irradiation from pulsed electron beams
(figs. S25 and S26) (24). Because each dolomite nanoparticle is a single
crystal and maintains its crystallographic orientation, the contrast in the
TEM image correlates linearly with nanocrystal thickness. Comparing the
initial boundary of the dolomite crystal (Fig. 4, A and B, blue) with the
boundary after 3840 dissolution cycles (Fig. 4B, red) shows radial crystal
growth of 60 to 170 nm, corresponding to 200 to 560 dolomite monolayers,
respectively. To visualize the growth contrast more clearly, an animation
comparing the before and after images is provided in movie S3. The electron
diffraction pattern in the newly grown region (Fig. 4, B and E, green) shows
diffraction peaks corresponding to ordered dolomite.

As a control experiment, we ran the same experiment at constant
supersaturation for 146 min with the electron beam off for the entire
duration. Imaging the resulting product showed no crystal growth (fig. S27 and
movie S4) (24). Additionally, dolomite dissolution was observed if the
electron beam was constantly on. Dolomite growth was only observed when the
beam strength was pulsed, corresponding to cyclic fluctuations in solution
supersaturation.

Our earlier calculation showed that dolomite growth on a disordered underlying
substrate is thermodynamically unfavorable (figs. S2 and S3) (24), so our
observation of bulk dolomite growth here implies that the beam-induced
dissolution-reprecipitation cycles have ordered the underlying dolomite layers
enough to facilitate bulk dolomite crystal growth. On the basis of an average
growth rate of 360 monolayers per 3840 dissolution cycles, this indicates that
~10 dissolution cycles are needed to order each surface layer enough for the
next monolayer to grow.

Fig. 2. Dolomite step-edge growth and ordering through
dissolution-reprecipitation under constant supersaturation as simulated with
kinetic Monte Carlo. (A) Surface formation energy is plotted versus
composition and cation ordering. The lowest energy configuration for a given
composition and ordering is plotted as the blue-green surface. The coordinate
for ordered dolomite is at Ca0.5Mg0.5CO3 composition, +1.0 ordering, and
surface formation energy of ΔE = -13.27 kJ/(mol surface CO3). The orange trace
is the KMC simulated progression of dolomite surface evolution from a Ca-rich
initial carbonate to an ordered dolomite layer over the course of 10⁷ years
under constant supersaturation. Two-dimensional projections of the energy
surface are also shown. (B) The atomic configuration of the evolving surface
at five different KMC steps. The initially disordered dolomite surface evolves
to complete order within ~10⁷ years.
Although our experiments were run at 80°C, an AFM experiment showed that even
at 100°C, only one layer of dolomite could grow within a 7-hour observation
window (32). Our direct laboratory observation of 100 nm of dolomite crystal
growth after 3840 supersaturation cycles within 2 hours validates the
importance of dissolution in facilitating dolomite ordering and growth.
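
The figures quoted for the pulsed-beam experiment can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text (cycle count, beam on and off durations, growth range, and monolayer counts); the variable names are illustrative.

```python
CYCLES = 3840
BEAM_ON_S = 0.010   # 10 ms of beam-induced dissolution per cycle
BEAM_OFF_S = 2.0    # 2 s of reprecipitation per cycle

# Total duration, including the short beam-on pulses (about 128 min).
total_minutes = CYCLES * (BEAM_ON_S + BEAM_OFF_S) / 60.0

# Radial growth (nm) and the corresponding monolayer counts from the text.
growth_nm = (60.0, 170.0)
monolayers = (200.0, 560.0)
# Implied thickness of a single dolomite monolayer, in nm.
layer_thickness_nm = [g / m for g, m in zip(growth_nm, monolayers)]

# Average of 360 monolayers over all cycles gives cycles needed per monolayer.
cycles_per_monolayer = CYCLES / 360.0
```

Both ends of the growth range imply a monolayer thickness of about 0.3 nm, and the average growth corresponds to roughly 10 to 11 dissolution cycles per monolayer, in line with the ~10 cycles stated above.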


Discussion 


Our theory supports the argument that Mg²⁺ dehydration is not the primary
kinetic limitation to bulk dolomite growth (19, 20). Rather, dolomite growth
is inhibited by Ca/Mg disordering along the [481̄] and [4̄41] step edges,
leading to a strained and disordered surface that inhibits further layer
growth. We also calculated for dolomite that the Mg-Ca antisite energy (a
single defect pair of A ions on B sites) is 0.29 eV per antisite, which is
less than a fifth that of norsethite BaMg(CO3)2 at 1.65 eV per antisite
(fig. S28) (24). This higher antisite energy in norsethite arises from the
greater mismatch in Mg/Ba ionic radii, which prevents the entropy of cation
disordering from overcoming the enthalpy of Mg-Ba ordering. This explains why
ordered norsethite can precipitate directly from supersaturated solutions,
whereas ordered dolomite cannot (17, 18).
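
To get a feel for how different these two antisite energies are at ambient temperature, one can compare their Boltzmann factors. This is a crude illustration and not a calculation from the article: it treats each antisite pair as an isolated defect at 25°C and ignores configurational entropy, composition, and kinetics. The function name is hypothetical.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K
T_ROOM_K = 298.15              # 25 degrees C

ANTISITE_ENERGY_EV = {"dolomite": 0.29, "norsethite": 1.65}


def boltzmann_factor(energy_ev, temperature_k=T_ROOM_K):
    """exp(-E / (k_B * T)) for a single isolated defect of energy E (in eV)."""
    return math.exp(-energy_ev / (K_B_EV_PER_K * temperature_k))


factors = {name: boltzmann_factor(e) for name, e in ANTISITE_ENERGY_EV.items()}
# How much more thermally accessible an antisite is in dolomite than in norsethite.
relative_likelihood = factors["dolomite"] / factors["norsethite"]
```

Even in this simplified picture, an antisite in dolomite is more than twenty orders of magnitude more accessible than one in norsethite, which is consistent with disorder being readily incorporated in dolomite but not in norsethite.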
Overall, our atomistic simulations provide a mechanistic theory to interpret
the role of salinity fluctuations in phenomenological dolomitization models,
such as the hypersaline model (50), reflux model (51), mixing zone model (52),
and seawater model (53). New questions emerge regarding how these atomistic
mechanisms extend from microscopic to geological length scales. Do
supersaturation fluctuations in nature occur on a daily, seasonal, or annual
cycle? Can microorganisms and biochemical mechanisms facilitate
supersaturation fluctuations (43, 54)? How does the burial dolomitization rate
depend on sediment grain size, ion budget, pH changes, rock porosity, pore
water replenishment, and pore connectivity (55)?

Kaczmarek and Sibley showed that natural well-ordered dolomites formed through
a layer-by-layer crystal growth mechanism (56). Given our experimentally
estimated dolomite growth rate of one monolayer per 10 dissolution cycles, it
is surprising that layer-by-layer growth can produce the massive dolomite
deposits in nature, such as the Dolomite Mountains of Northern Italy or the
caprock of Niagara Falls, USA. Then again, crystal growth rates as slow as
300 nm/year can still yield enormous meter-sized single crystals over
geological timescales, as evidenced by the giant gypsum crystals of the Naica
ore mines in Chihuahua, Mexico (57), and dolostone certainly does not need to
be single crystalline. Furthermore, highly ordered dolomites tend to be found
in mineral samples older than ~100 million years (22, 23, 58), although
younger ordered samples (59) may be reconcilable with more frequent
dissolution events. Given that continental drift also occurs on a timescale of
100 million years, we speculate that an interesting research direction may be
to connect dolomite ordering rates to tectonic movement, in which a moving
land-sea interface could potentially be a source of
dissolution-reprecipitation for massive dolomite generation.

From a crystal growth perspective, our work further reveals a general insight
that the growth and ripening of ordered crystals can be kinetically
accelerated by deliberate periods of mild dissolution. Any defect in a
crystal, whether that defect is disorder, dislocations, impurities, or
otherwise, will fundamentally be in a nonequilibrium state. Defective regions
are higher in energy than pristine regions and thus will dissolve faster and
grow slower, which over time results in a net flux of atoms from defective to
pristine sites. By deliberately introducing periods of mild undersaturation,
one can facilitate the dissolution of defects, whose dissolution would
otherwise proceed very slowly under constant high supersaturation. Beyond low
solute concentrations, undersaturation can also be driven by temperature
oscillations (60), etching (61), voltage in electrodeposition (62), redox
potential in natural soils (63), and so on (64), which can all serve as viable
control handles to accelerate the kinetics of ripening.

Fig. 3. The role of supersaturation fluctuations in accelerating
dissolution-reprecipitation processes in dolomite ordering. (A and B) Although
rare, modern dolomite deposits are observed where salinity fluctuations are
common, such as (A) evaporative and (B) coastal environments. (C and D) We
simulated the salinity fluctuation by means of (C) step-regulated
supersaturation fluctuation, in which supersaturation is inverted every step,
and (D) time-regulated KMC simulation, in which supersaturation is inverted
after a time limit (fig. S17) (24). Under supersaturation fluctuations, the
cation ordering process is accelerated by orders of magnitude compared with
ordering under constant supersaturation. (E) Snapshots of surface structures
from time-regulated supersaturation fluctuations at σ = -1.5, with
corresponding times

Fig. 4. In situ liquid cell TEM images of dolomite crystal growth. (A)
Dolomite crystal before beam-induced dissolution cycles. (B) Crystal after
3840 dissolution cycles. The blue dashed lines (before dissolution cycles) and
the red dashed lines (after dissolution cycles) indicate the crystal edges
quantified with image contrast analysis (24). The blue dashed lines in (A) are
identical to the ones in (B). The blue and red dashed lines show that the
crystal grows in all directions after 3840 dissolution cycles. The growth rate
ranges from 8 to 30 nm per 1000 cycles, depending on the direction. (C)
Schematic of the in situ liquid cell TEM experiment. (D) Electron diffraction
pattern measured from the area indicated with a green dashed circle in (B).
Diffraction spots shown by white triangles have been assigned as ordered
dolomite
THERMOELECTRICS


Screening strategy for developing thermoelectric interface materials


Liangjun Xie"|, Li Yin+, Yuan Yu°+, Guyang Peng“, Shaowei Song®, Pingjun Ying®, Songting Cai’, 
Yuxin Sun’, Wenjing Shi?, Hao Wu, Nuo Qu’, Fengkai Guo’, Wei Cai?, Haijun Wu’, Qian Zhang’, 
Kornelius Nielsch®, Zhifeng Ren®, Zihang Liu’*, Jiehe Sui’* 


Thermoelectric interface materials (TEiMs) are essential to the development of thermoelectric 
generators. Common TEiMs use pure metals or binary alloys but have performance stability issues. 
Conventional selection of TEiMs generally relies on trial-and-error experimentation. We developed a 
TEiM screening strategy that is based on phase diagram predictions by density functional theory 
calculations. By combining the phase diagram with electrical resistivity and melting points of potential 
reaction products, we discovered that the semimetal MgCuSb is a reliable TEiM for high-performance 
MgAgSb. The MgCuSb/MgAgSb junction exhibits low interfacial contact resistivity (ρ_c < 1 microhm
square centimeter) even after annealing at 553 kelvin for 16 days. The fabricated two-pair MgAgSb/
Mg3.2Bi1.5Sb0.5 module demonstrated a high conversion efficiency of 9.25% under a 300 kelvin
temperature gradient. We performed an international round-robin testing of module performance to 
confirm the measurement reliability. The strategy can be applied to other thermoelectric materials, 
filling a vital gap in the development of thermoelectric modules. 


Thermoelectric generators (TEGs) have been applied to power deep space exploration since the
1960s owing to the advantages of simple system structure, long-term stability, and vibration-free
operation (1-3). In recent years, TEGs have exhibited high potential in waste heat recovery (4).
However, because of the challenges in designing reliable thermoelectric interface materials
(TEiMs), the extensive commercial application of TEGs remains stagnant despite breakthroughs in
developing thermoelectric (TE) materials (5-8) and improvements in their performance (9-13).
Specifically, wide-bandgap semiconductors may have the potential to achieve high performance
over a wide temperature range (14). TEiMs aim mainly to suppress diffusion of elements and/or
chemical reactions and to improve solderability while maintaining low energy loss in the
transport of heat and electricity between TE materials and electrode strips (15). Therefore,
the performance stability of TEGs depends critically on the TEiM, especially at elevated
temperatures.
Traditionally, TEiMs were selected on the principle of a matched coefficient of thermal
expansion (CTE) for mechanical reliability and a coordinated work function (W_F) for low
interfacial contact resistivity (ρ_c), leading to the selection of several pure metals or
alloys as potential TEiM candidates for the corresponding TE materials (15, 16). Then, a
junction structure composed of TEiM and TE materials was prepared to investigate the ρ_c and
thermal stability after high-temperature annealing. On the basis of this design principle,
several TEiMs were successfully identified, such as Ni for p-type Bi2Te3 (17), Nb for filled
skutterudites (18), Al for Mg2(Si, Sn) (19), Fe (20) and 304 stainless steel (16) for n-type
Mg3Sb2, Au for sulfide (21), and MoFe for NbFeSb half-Heusler (22). However, this conventional
method gradually became ineffective in the selection of reliable TEiMs with the emergence of
high-performance TE materials. For one thing, the search for and development of reliable TEiMs
strongly depend on Edisonian trial-and-error experimentation that entails synthesis,
processing, long-term annealing at elevated temperatures, and property characterization (23),
which is time- and cost-consuming. For another, designing a highly stable interface between
TEiM and TE materials is challenging, especially with the currently limited set of metal
species. Long-term service at elevated temperature may promote chemical reactions and the
appearance of detrimental secondary phases (24, 25). Subsequently, the mismatch of W_F and CTE
between in situ secondary phases and TEiM/TE materials will inevitably yield a higher ρ_c and
fracture risk of the module, thus triggering degradation of conversion efficiency (η) and even
failure of the TEG. Therefore, a systematic screening strategy for highly stable TEiMs is
urgently required for the long-term efficient service of TEGs.

From the viewpoint of the Gibbs energy of
the constituted system, when chemical products
that are formed from TE materials and
pure elements are at the final chemical equilibrium,
a driving force would not be provided
for the formation of second phases. In addition,
regarding the suitability of TEiMs, both
CTE and W_F must be matched to further ensure
the long-term interface thermal stability
chemically and mechanically. In particular,
phase diagram calculations offer theoretical 
predictions of the state of a system based on 
Gibbs energies, which enable rational screening 
of a large set of compounds that have a stable 
two-phase equilibrium with the host matrix and 
therefore effectively reduce the experimental 
effort required to discover these targeted re- 
action products. Leveraging the rapid estab- 
lishment of The Open Quantum Materials 
Database (OQMD) (26, 27), various ternary 
and even quaternary phase diagrams can be 
easily constructed. 

In recent years, MgAgSb/Mg3Bi2 TE modules have shown great promise in low-temperature
thermal energy harvesting because of their high η and abundant constituent elements (28-31).
As a result, these modules may replace classical Bi2Te3-based TE modules. Notably, Ag was
selected as TEiM for p-MgAgSb owing to the low ρ_c and well-matched CTE (28-32). However,
under Ag-rich conditions, a Ag3Sb phase can be easily formed for MgAgSb (30, 33-35), which
may lead to performance deterioration of MgAgSb/Mg3Bi2 modules. In addition, there is a lack
of research on the long-term thermal stability of Ag/MgAgSb junctions. Therefore, the
development of a highly reliable TEiM for MgAgSb is crucial for the long-term application of
MgAgSb/Mg3Bi2 modules.

Taking MgAgSb as a typical case study, we 
established a TEiM screening strategy with the 
help of phase diagram calculations. According 
to this screening strategy, we predicted that 
semimetal MgCuSb should be a promising TEiM 
for MgAgSb with low ρ_c and matched CTE. The
detailed microstructure investigation on the
MgCuSb/MgAgSb composite indicated that
MgCuSb had a stable two-phase equilibrium
with MgAgSb. In addition, the MgCuSb/MgAgSb
junction demonstrated an ultralow ρ_c (<1
microhm cm^2) even after annealing at 553 K
for 16 days. We showed that a two-pair MgAgSb/
Mg3.2Bi1.5Sb0.5 module using MgCuSb as TEiM
exhibited a very high conversion efficiency of
9.25% with a cold-side temperature (T_c) of
293 K and a hot-side temperature (T_h) of 593 K.
These values were confirmed by an international 
round-robin test. Meanwhile, our proposed strat- 
egy can be applied to other representative TE 
systems, which serves as a powerful and generally 
applicable means for screening highly stable 
TEiMs for power-generation applications. 


Results and discussion 


Fig. 1. The screening strategy of stable thermoelectric interface materials. (A) Flowchart of the TEiM screening strategy for a case study of MgAgSb. (B) Mg-Ag-Cu-Sb quaternary phase diagram created by using the OQMD database (26, 27). (C) The room-temperature electrical resistivity ρ_rt and melting point T_m of the selected compounds and the reference sample MgAgSb.

We produced an overall flowchart for our
combinatorial TEiM screening strategy for
MgAgSb (Fig. 1A). A stable and reliable TEiM
screening process involved the following four 
steps. (i) To obtain as many potential TEiMs as 
possible, we selected 17 traditional TEiM metals 
that are neither toxic nor radioactive (Al, Ti, V, 
Cr, Fe, Co, Ni, Cu, Ga, As, Zr, Nb, Mo, In, Hf, W, 
and Bi) as candidates. (ii) To screen compounds 
that can have a stable two-phase equilibrium 
with MgAgSb, we established 17 M-Mg-Ag-Sb
(M is the above-mentioned metal) quaternary
phase diagrams according to the OQMD database
(figs. S1 and S2 and Fig. 1B). A preliminary examination
allowed us to reference 17 compound-metal
candidates that can have a stable two-phase
equilibrium with MgAgSb (see table S1 for the
full list). Taking Cu-Mg-Ag-Sb as an example
(Fig. 1B), MgCuSb was one of the thermodynamically
stable phases in the phase region of
MgAgSb-MgCuSb-Mg3Sb2, implying that no
other phases can be generated at the MgCuSb/
MgAgSb interface. Consequently, we identified
MgCuSb as a stable TEiM for MgAgSb. (iii) The
room-temperature electrical resistivity (ρ_rt) and
melting point (T_m) of the selected compounds and
metals were used as key criteria in screening
for the desired TEiMs. In general, η and output
power density (P_d) of TE devices are also
determined by ρ_c. Pei et al. suggested that the
ratio of interfacial contact resistance to internal
resistance should be less than 3% to achieve the
full potential of TE materials (36, 37). However,
the Joule heat caused by TEiMs should not be
overlooked when the intrinsic resistivity of
TEiMs is high. Assuming a TE material length
of ~3 mm (close to the conventional
value for commercial TE devices) and a TEiM
thickness of ~0.3 mm, and ignoring the ρ_c, the
required ρ_rt should be less than 2.6 microhm m
in this work (fig. S3). Furthermore, according to
solid-state sintering theory, the sintering
temperature (T_s) of metal compounds should
be higher than ~0.4 T_m to obtain a dense bulk
material (38, 39). As a result, limited by
the T_s (300°C) of MgAgSb, the T_m of the selected
compounds and metals should be lower
than 750°C. The reported ρ_rt and T_m of these
compounds are listed in table S1. On the basis
of these two criteria, we proposed a two-dimensional
(2D) map (Fig. 1C), in which a
“sweet spot” area that contains MgCuSb and
MgNiSb was located in the lower left corner as
marked by the orange color. The specific criteria
for both ρ_rt and T_m were limited by the preparation
process of TEiM. Once we optimize the
preparation process, such as going from one-step
sintering to electroplating, vapor deposition,
or thermal spraying, more compounds
and metals may become suitable TEiM candidates.
(iv) We conducted experimental verification
of ρ_c and thermal stability to further
determine the feasibility of MgCuSb and MgNiSb
as reliable TEiMs for MgAgSb.
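
The two numerical criteria in step (iii) can be read as a simple filter over candidate compounds. The sketch below is illustrative only: the thresholds (ρ_rt below 2.6 microhm m and T_m below 750°C) are taken from the text, whereas the `Candidate` class, the `in_sweet_spot` helper, and the three example entries with their property values are hypothetical placeholders and are not data from table S1.

```python
from dataclasses import dataclass

# Thresholds quoted in the text for the MgAgSb case study.
RHO_RT_MAX_UOHM_M = 2.6   # room-temperature resistivity limit, microhm m
TM_MAX_C = 750.0          # melting-point limit, degrees Celsius


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one candidate (values are not from table S1)."""
    name: str
    rho_rt_uohm_m: float  # room-temperature electrical resistivity
    tm_c: float           # melting point


def in_sweet_spot(c: Candidate) -> bool:
    """Return True if the candidate meets both screening criteria."""
    return c.rho_rt_uohm_m < RHO_RT_MAX_UOHM_M and c.tm_c < TM_MAX_C


# Placeholder values chosen only to exercise the filter.
candidates = [
    Candidate("candidate A", rho_rt_uohm_m=1.2, tm_c=600.0),
    Candidate("candidate B", rho_rt_uohm_m=5.0, tm_c=650.0),
    Candidate("candidate C", rho_rt_uohm_m=0.8, tm_c=1200.0),
]

selected = [c.name for c in candidates if in_sweet_spot(c)]
print(selected)
```

If a different preparation route relaxes either limit, as the text suggests, only the two constants would need to change.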

In accordance with our screening strategy, 
we selected MgNiSb and MgCuSb as two desired 
stable TEiM candidates for MgAgSb. Despite 
making several attempts, we could not synthesize
pure MgNiSb because NiSb and Mg3Sb2 phases were
present (fig. S7). Elemental mappings using
energy-dispersive spectroscopy (EDS) revealed 




Fig. 2. Microstructure analysis of MgAgSb/MgCuSb composite material. 
(A) HAADF STEM image showing the presence of nanoprecipitates in the matrix. 
Seven areas were selected to investigate the composition of two precipitates and 
matrix. (B) Enlarged STEM image of the representative Cu-rich nanoprecipitate. 
(C) High-resolution transmission electron microscope image of the enlarged area 
in (B) showing a clear phase boundary between the precipitate and the matrix. 
The insets are the corresponding FFTs of two phases, which are calibrated as 
MgCuSb (top) and MgAgSb (bottom), respectively. (D) 3D reconstruction
showing the distribution of constituent elements Mg, Ag, Cu, and Sb, indicating
the presence of a Cu-rich region at the bottom-right of the needle-shaped
specimen. (E and F) Spatial distribution of Ag and Cu atoms, respectively,
indicating that Ag and Cu are enriched at both the grain boundary and phase
boundary. (G and H) 1D composition profiles across the phase boundary (PB)
and grain boundary (GB), respectively, as indicated by the arrows in (D). 


that Ni enrichment is plentiful in the matrix
(fig. S8). Hence, the failure of pure MgNiSb
synthesis led to the formation of a Ni-enriched
impurity phase at the MgNiSb/MgAgSb interface
(fig. S9), resulting in high ρ_c (fig. S10). Fortunately,
no obvious increase in the R_c curve
appeared for the MgCuSb/MgAgSb junction
near the interface (R_c = R × S, where R and S
are the resistance and cross-sectional area of
junctions, respectively; the difference in R_c on
both sides of the interface is the ρ_c), indicating
that ρ_c was less than 1 microhm cm^2 at
room temperature (fig. S10). This value was
very low compared with those of conventional
Ag/MgAgSb junctions. The ultraviolet photoelectron
spectroscopy (UPS) spectra that we
collected showed that the W_F of p-type MgAgSb
(4.56 eV) is lower than that of MgCuSb (4.68 eV)
(fig. S11). Meanwhile, the band structure that
we calculated indicates that MgCuSb is a semimetal
(fig. S12). Consequently, driven by the
difference in Fermi level (chemical potential),
electrons (minority carriers) of MgAgSb will
move from MgAgSb into MgCuSb until the

Fig. 3. The evolution of interfacial microstructure and interfacial contact resistivity ρ_c. (A) BSE images and line scanning of both as-made and 12-hour-annealed Ag/MgAgSb junctions. (B) BSE images and line scanning of both as-made and 16-day-annealed MgCuSb/MgAgSb junctions. (C) Annealing time-dependent interfacial resistivity ρ_c of both Ag/MgAgSb and MgCuSb/MgAgSb junctions annealed at 553 K. ρ_c data of the Ag/MgAgSb junction with an annealing time over 12 hours are too large to be shown. The nominal composition of MgAgSb-based materials for the MgCuSb/MgAgSb junction is MgAg0.87Cu0.1Sb0.99.


Fermi levels near the MgCuSb/MgAgSb interface
are equal and an equilibrium state is
reached. As electrons migrate from MgAgSb
into MgCuSb, more holes (majority charge
carriers in MgAgSb) are generated and accumulate
at the MgAgSb side, leading to
the formation of an antiblocking layer with
high electrical conductivity at the MgAgSb/
MgCuSb junction (fig. S13). Furthermore, we
found a perfectly linear I-V relationship with
no obvious rectification characteristic for the
MgCuSb/MgAgSb junction (fig. S14), indicating
that the MgCuSb/MgAgSb interface was indeed
an ohmic contact. Accordingly,
these features contributed to the low ρ_c of the
MgAgSb/MgCuSb junction.
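
The line-scan definition given earlier in this section (R_c = R × S, with ρ_c taken as the difference in R_c across the interface) can be sketched numerically. The function below is a minimal illustration under stated assumptions: it extrapolates R_c linearly from the two nearest scan points on each side to the interface position, which is one plausible way to take the difference and is not confirmed as the procedure used in this work. The function name, argument names, and scan values are hypothetical.

```python
def contact_resistivity(positions_mm, resistance_uohm, area_cm2, interface_mm):
    """Estimate rho_c (microhm cm^2) as the step in R_c = R * S at the interface.

    Assumes positions are sorted in ascending order and that at least two
    scan points lie on each side of the interface.
    """
    # Area-specific resistance R_c = R * S at every scan position.
    rc = [r * area_cm2 for r in resistance_uohm]
    left = [(x, v) for x, v in zip(positions_mm, rc) if x < interface_mm]
    right = [(x, v) for x, v in zip(positions_mm, rc) if x > interface_mm]
    if len(left) < 2 or len(right) < 2:
        raise ValueError("need at least two scan points on each side")

    def extrapolate(p0, p1, x):
        # Straight line through p0 and p1, evaluated at x.
        (x0, y0), (x1, y1) = p0, p1
        return y1 + (y1 - y0) / (x1 - x0) * (x - x1)

    rc_left = extrapolate(left[-2], left[-1], interface_mm)
    rc_right = extrapolate(right[1], right[0], interface_mm)
    return rc_right - rc_left


# Hypothetical scan: four points, interface at 1.5 mm, area 0.25 cm^2.
rho_c = contact_resistivity([0.0, 1.0, 2.0, 3.0], [4.0, 8.0, 14.0, 18.0], 0.25, 1.5)
print(rho_c)
```

Extrapolating from both sides removes the contribution of the bulk slope, so only the jump attributable to the interface remains.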

Mechanical properties of TE junctions are 
commonly influenced by the CTE mismatch 
between TEiM and TE materials (40, 41). The 
lower CTE mismatch is closely associated with 
the weaker interface thermal stress. Therefore, 
we measured the temperature-dependent CTE 
of MgAgSb and MgCuSb from 320 to 553 K 
(fig. S15), where the average CTE of MgCuSb
(16.5 × 10^-6 K^-1) is close to that of MgAgSb
(20.0 × 10^-6 K^-1). In addition, the differential
scanning calorimetry (DSC) curve indicated 
that MgCuSb had no phase transition across 
the temperature range of 320 to 623 K (fig. S16). 
Therefore, we expected a stronger interfacial 
bonding with minimized stress concentration 
for the MgAgSb/MgCuSb junction. Benefiting 
from excellent ohmic contact and well-matched 
CTE between MgAgSb and MgCuSb, we con- 
cluded that MgCuSb is favorable as a TEiM 
for MgAgSb. 

To experimentally determine the potential
reaction products at the MgCuSb/MgAgSb
interface, we fabricated a MgAgSb/
MgCuSb composite material with a nominal
composition of MgAg0.87Cu0.1Sb0.99. We investigated
the related microstructure using scanning
transmission electron microscopy (STEM)
and 3D atom probe tomography (APT). The
high-angle annular dark-field (HAADF) STEM
image displayed abundant precipitates with a
size of about 50 to 500 nm in diameter embedded
in the MgAgSb matrix (Fig. 2A). EDS
elemental mapping indicated that the precipitates
were enriched in Cu and Ag, whereas Mg
and Sb were distributed homogeneously in the
entire area (fig. S17). We performed measurements
to further examine the composition of two
precipitates, confirming that the two distinct
precipitates were mainly MgCuSb (large and
plentiful) and Ag3Sb (small and few) (table S2).
Figure 2B depicts a representative Cu-enriched 
precipitate surrounded by the MgAgSb matrix. 
The magnified images show a clean phase bound- 
ary between MgAgSb (bottom) and MgCuSb 
(top) phase (Fig. 2C), which was confirmed by 
the corresponding FFT (fast Fourier transform) 
patterns. 

The 3D-APT technique is capable of quan- 
tifying the composition of tiny volumes on 
the subnanometer scale with high elemental 
sensitivity of tens of parts per million (42). 
This enables an accurate understanding of the 
chemical composition of interfaces observed 
by STEM. The 3D-APT reconstruction of the 
specimen showed each element depicted by 
a specific color point (Fig. 2D) with a Cu-rich 
precipitate that can be roughly recognized 
on the bottom right of the needle-shaped 
specimen. Moreover, we can simultaneously 

Fig. 4. Power generation performance and stability of MgAgSb/Mg3.2Bi1.5Sb0.5 TE modules. (A) Photographs of our fabricated two-pair and seven-pair TE modules. (B and C) The maximum energy conversion efficiency η (B) and the maximum output power density P_d (C) of two-pair and seven-pair modules as a function of temperature difference ΔT, in comparison with their counterparts, such as the Bi2Te3 module (50), SnSe/BiTeSe module (51), and MgAgSb/Mg3.2Bi1.5Sb0.5 (28-30) modules. The same TE modules were measured by three independent research groups, with the corresponding labels as HIT-China,

observe the segregation of Ag and Cu atoms
at the grain boundary and phase bound- 
ary (Fig. 2, E and F). Across the Cu-rich 
precipitate/matrix interface, we found an 
Ag- and Cu-rich region with a width of ~10 nm 
(Fig. 2G). Away from the boundary region, 
the concentration of the main element was 
close to the nominal composition of the MgAgSb 
(top left) and MgCuSb (bottom right). More- 
over, we observed a similar Ag- and Cu-rich 
layer at the grain boundary of MgAgSb (Fig. 
2H). Theoretically, the segregation of solute 
atoms at the interfaces will practically min- 
imize the Gibbs free energy of the solid 
solution (43, 44), which will lower the inter- 
face energy and make the interface structure 
more stable. Therefore, we attributed the observation
of Ag at both the phase boundary
and grain boundary to thermodynamic Gibbs
adsorption rather than the formation of a Ag3Sb
phase. In addition, our APT data demonstrated
that the interdiffusion between Cu
and Ag in the matrix of these two different
phases was very small. This ensured that the
MgAgSb matrix would not be contaminated
by the MgCuSb barrier layer. Meanwhile, this 
finite interdiffusion in the vicinity of the phase 
boundary will promote the effective adhe- 
sion of these two materials, thus ensuring a 
low contact resistance and a strong mechan- 
ical strength. On the basis of the microstruc- 
ture investigation and compositional analysis 
above, we concluded that MgCuSb could be 
formed spontaneously without the forma- 
tion of other intermediate phases, implying 
that MgCuSb has a stable two-phase equilib- 
rium with MgAgSb. 

Given that the hot side of TEG usually works 
at an elevated temperature, the heat endurance 
of the most vulnerable TEiM/TE materials junc- 
tion is an important criterion for evaluating 
stability of TE devices (36, 37, 45, 46). In ad- 
dition, although Ag was often used as a TEiM 
for MgAgSb (28, 30-32), only Ying et al. investigated
the stability of the TE module at T_h =
500 K under argon protection (30). Consequently,
we annealed the Ag/MgAgSb and MgCuSb/
MgAgSb junctions at 553 K in vacuum for several
hours (1, 3, 6, and 12) and days (1, 3, 5, 10,
and 16) to systematically investigate the interfacial
structure and ρ_c evolution, in which the
nominal composition of MgAgSb-based materials
for the MgCuSb/MgAgSb junction is
MgAg0.87Cu0.1Sb0.99.

The backscattered electron (BSE) image and EDS line scan of the as-sintered Ag/MgAgSb
junction showed negligible chemical diffusion at the interface (Fig. 3A). However, after
12 hours of annealing at 553 K, the initial interface gradually grew into an Ag-rich and
Sb-deficient zone ~10 μm wide in the MgAgSb matrix near the interface. An obvious Ag-rich
deteriorated phase and cracks formed at the Ag/MgAgSb interface. Composition analyses
indicated that the deteriorated phase near the interface was mainly Ag3Sb (fig. S18 and
table S3). In contrast, except for a diffusion layer with a width of about 20 μm at
the MgCuSb/MgAgSb junction, we detected no deteriorated phase or crack even after
annealing at 553 K for 16 days (Fig. 3B). From
the thermodynamic viewpoint, the annealing 

Fig. 5. The evolution of interfacial resistivity and microstructure for TEiM/TE junctions. (A) NiTe2/Bi0.5Sb1.5Te3 junctions (as sintered and annealed at 473 K
for 7 days). (B) TiSb2/ZnSb junctions (as sintered and annealed at 623 K for 7 days). (C) CoAl/CoSb3 junctions (as sintered and annealed at 823 K for 7 days).
(D) CoAl/ZrCoSb junctions (as sintered and annealed at 923 K for 7 days). 


treatment facilitates particle precipitation and 
coarsening, leading to the appearance of particles
in the MgAgSb region. This is a common
transition from the nonequilibrium state to the 
equilibrium state, as we also observed numerous 
particles inside the material after annealing 
(fig. S19). However, because no deteriorated 
phases formed, element diffusion due to the 
concentration gradient will strengthen the in- 
terfacial metallurgical bonding without affect- 
ing the interfacial stability. 

We conducted measurements of ρ_c of Ag/MgAgSb and MgCuSb/MgAgSb with different
annealing times (figs. S20 and S21). The ρ_c of the Ag/MgAgSb junction increased rapidly
from ~6.1 to 26.0, 98.1, 418.8, and 1006.0 microhm cm^2 after annealing for 1, 3, 6, and
12 hours, respectively (table S4 and Fig. 3C), which was due to the formation of both Ag3Sb
and cracks. To investigate the origin of crack formation and the rapidly increased ρ_c of
Ag/MgAgSb junctions, we synthesized pure and dense Ag3Sb (fig. S22A) to measure its CTE.
Our DSC curve indicated that the synthesized Ag3Sb had no phase transition over the
temperature range of 320 to 573 K (fig. S22B). The average CTE of Ag3Sb from 320 to 553 K
was ~24.11 × 10^-6 K^-1 (fig. S22C), which was about a 17% mismatch with MgAgSb (average
CTE is ~20.51 × 10^-6 K^-1). Notably, the CTE of Ag3Sb rapidly increased to
31.8 × 10^-6 K^-1 at 550 K, much higher than the 25.8 × 10^-6 K^-1 of MgAgSb. We regarded
this CTE mismatch as the main reason for service failure of the Ag/MgAgSb junction. This
implied that Ag was unreliable as the TEiM of MgAgSb in long-term service under
high-temperature conditions. In stark contrast, owing to the matched CTE, suitable W_F, and
high thermodynamic stability between MgCuSb and MgAgSb, the ρ_c of MgCuSb/MgAgSb remained
almost unchanged and was less than 1 microhm cm^2 even after 16 days of annealing at 553 K,
suggesting the efficacy of our proposed TEiM screening strategy. Our experimental results
did not contradict previous reports from Ying et al. (30), even though we used different
conditions (553 K versus 500 K and vacuum versus argon) for the thermal stability check.
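
The quoted CTE mismatch can be checked with a short calculation. The sketch assumes that the mismatch is defined relative to the MgAgSb value, which is an assumed definition rather than one stated in the text; the helper name `cte_mismatch_percent` is hypothetical. CTE values are in units of 10^-6 K^-1, as given above.

```python
def cte_mismatch_percent(cte_phase, cte_matrix):
    """Relative CTE mismatch of a phase with respect to the matrix, in percent."""
    return abs(cte_phase - cte_matrix) / cte_matrix * 100.0


# Average values over 320 to 553 K quoted in the text (units: 1e-6 per K).
avg = cte_mismatch_percent(24.11, 20.51)
# Values near 550 K quoted in the text.
hot = cte_mismatch_percent(31.8, 25.8)

print(f"average mismatch: {avg:.1f}%")
print(f"mismatch near 550 K: {hot:.1f}%")
```

Under this definition the average mismatch falls between 17 and 18%, consistent with the "about 17%" quoted in the text, and the mismatch is larger at the upper end of the temperature range.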
Furthermore, we constructed several Te-free TE modules composed of p-type MgAgSb with
MgCuSb as the TEiM and n-type Mg3.2Bi1.5Sb0.5 with Fe as the TEiM (Fig. 4A). We investigated
their performance and thermal stability (detailed measurement parameters can be found in
figs. S23 to S32). A longer leg length L will result in a larger real temperature difference
(ΔT) across the TE module, which leads to a higher η but a lower output power density.
Therefore, we fabricated two-pair modules with a leg length of ~8 mm. We also prepared
seven-pair modules with a leg length of ~3 mm because a shorter L means a higher output power
density and lower module cost. Because an accurate measurement of heat flux (47, 48) is
challenging, conversion efficiency tests frequently involve large deviations. To address
this, we conducted an international round-robin check of module efficiency testing with
multiple laboratories in Asia, Europe, and the United States to verify our measurement
reliability. TE module performance from three independent groups, labeled as HIT-China,
UH-USA, and IFW-Germany (Fig. 4, B and C), was comparable. For example, at a hot-side
temperature of ~593 K, the maximum conversion efficiency η_max of the two-pair modules was
9.5 and 9.0%, and the maximum output power P_max was 0.37 and 0.34 W for HIT-China and
UH-USA, respectively. In comparison to all the state-of-the-art TE modules, our conversion
efficiency value outperformed the others at the same ΔT (28-30, 49-51). Specifically, in
comparison to previous reports about MgAgSb/Mg3(Bi, Sb)2-based TE modules (28-30),
considering similar theoretical η (fig. S31), the beneficial role of MgCuSb TEiM should be
acknowledged. For seven-pair modules, the η_max was ~8.1, 7.8, and 7.8%, and the P_max was
0.48, 0.47, and 0.46 W, for HIT-China, UH-USA, and IFW-Germany, respectively. By comparing
the round-robin results at ΔT = 300 K (figs. S27 and S30 and tables S5 and S6), the relative
standard deviation of P_max was 5.25 and 1.29% for two-pair and seven-pair modules,
respectively, whereas the relative standard deviation of heat flow Q was 1.30 and 1.35%,
respectively. Both in turn led to a relative standard deviation of η_max of ~3.82 and 2.15%,
respectively, demonstrating that our module measurement results were reliable. Such a small
variation was mainly due to the real temperature difference across the TE module in the
measurement apparatus, considering different positions of hot-side thermocouples as well as
interfacial thermal resistance. Therefore, we used the average value to describe the
power-generation performance.
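
The relative standard deviation used to compare laboratories can be reproduced from the rounded efficiencies quoted above. The sketch assumes the sample standard deviation (n - 1 in the denominator), which the text does not specify; the helper name `relative_std_percent` is hypothetical. Because the inputs are rounded, results may differ slightly from the values reported in tables S5 and S6.

```python
import statistics


def relative_std_percent(values):
    """Sample standard deviation divided by the mean, in percent."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Rounded maximum efficiencies (%) quoted in the text.
eta_two_pair = [9.5, 9.0]          # HIT-China, UH-USA
eta_seven_pair = [8.1, 7.8, 7.8]   # HIT-China, UH-USA, IFW-Germany

mean_two_pair = statistics.mean(eta_two_pair)
rsd_two_pair = relative_std_percent(eta_two_pair)
rsd_seven_pair = relative_std_percent(eta_seven_pair)

print(f"two-pair: mean = {mean_two_pair:.2f}%, RSD = {rsd_two_pair:.2f}%")
print(f"seven-pair: RSD = {rsd_seven_pair:.2f}%")
```

For the two-pair modules this gives a mean of 9.25% and a relative standard deviation of about 3.8%, in line with the averaged efficiency and the ~3.82% reported in the text.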

We carried out continuous measurements 
to examine the long-term reliability of modules. 
To keep the measurement temperature away from the
phase-transformation temperature of the MgAgSb
leg, we continuously tested the TE module for
44 hours with T_h at ~553 K and T_c at ~293 K.
Both P and η decreased slightly (~3.0% loss)
after 44 hours of continuous measurement (Fig.
4D and fig. S32). P-type legs exhibited good
stability after 16-day annealing at 553 K, but
TE performance of the n-type Mg3(Bi, Sb)2 alloy
tended to degrade and was associated with
Mg loss (52). After the measurement, ρ_rt of
our fabricated n-type leg had increased
from 8.71 to 12.32 microhm m (near the hot side
of the leg), leading to increased internal
resistance and reduced output power (fig. S33).
Therefore, a protective coating layer is required
for n-type Mg3(Bi, Sb)2 modules in power-generation
applications. We also prepared a
seven-pair MgAgSb/Mg3.2Bi1.5Sb0.5 TE module
without TEiM for p-type MgAgSb to evaluate
the role of MgCuSb TEiM, which, however,
showed poor module performance (figs. S34
and S35). Therefore, a TEiM, especially a
stable TEiM with low ρ_c, is essential for TE
modules at elevated temperatures.

To illustrate the general applicability of the 
TEiM screening mechanism in other TE systems, 
we chose several representative materials work- 
ing from low temperature to high temperature, 
such as Bi2Te3, ZnSb, CoSb3, and ZrCoSb. Using
results of phase diagram calculations, we selected
NiTe2, TiSb2, CoAl, and CoAl as the corresponding
TEiM for Bi2Te3, ZnSb, CoSb3, and
ZrCoSb, respectively (fig. S36). These selected
TEiMs showed a stable two-phase equilibrium
with the corresponding TE materials without
the formation of any possible second phase.
For the fabricated NiTe2/Bi0.5Sb1.5Te3 (Fig. 5A),
TiSb2/ZnSb (Fig. 5B), CoAl/CoSb3 (Fig. 5C), and
CoAl/ZrCoSb (Fig. 5D) junctions, we observed
that ρ_c is small and can be neglected. Moreover,
after annealing for 7 days at 473, 623, 823, and
923 K, respectively, there was no sign of any
second phases around interfaces, and the corresponding
ρ_c showed no change. Therefore,
our proposed approach had general applicability
that led to the identification of suitable
TEiMs with high stability. In addition, we fabricated
a two-pair Bi0.5Sb1.5Te3/Mg3.2Bi1.5Sb0.5 TE
module with NiTe2 as the TEiM for Bi0.5Sb1.5Te3
and measured the power generation performance
at T_c = 283 K (figs. S37 and S38). A
high η of 7.9% was obtained at ΔT = 240 K
(fig. S39), outperforming the commercial Bi2Te3-based
TE module from KELK Ltd. (7.2%) (53)
and other state-of-the-art Bi2Te3-based TE modules
(50, 54, 55).


Conclusions 


We have proposed an effective TEiM screening 
strategy by using phase diagram calculations 
to identify suitable reaction products. By means 
of this combinatorial strategy, we identified 
MgCuSb as a reliable TEiM for the emerging 
MgAgSb material. The MgCuSb/MgAgSb junction
exhibited low contact resistivity (<1 microhm
cm^2) even after being annealed at 553 K for
16 days, in sharp contrast to the Ag/MgAgSb
junction, which exhibited a contact resistivity
of ~1000 microhm cm^2 after being annealed at
553 K for 12 hours. Therefore, the MgAgSb/
Mg3.2Bi1.5Sb0.5 module consisting of MgCuSb
as a TEiM for p-legs demonstrated a high
conversion efficiency of ~9.25% at a temperature
difference of 300 K, which we confirmed
through an international round-robin test of
module performance in multiple laboratories.
Moreover, the general applicability of
such a TEiM screening strategy was also shown 
with several representative TE materials. Our 
strategy provides a generally applicable path- 
way to address the bottleneck in the develop- 
ment of highly stable TEiMs for efficient power 
generation. 

GLYCOSYLATION

Palladium catalysis enables cross-coupling-like S_N2-glycosylation of phenols

Li-Fan Deng†, Yingwei Wang†, Shiyang Xu, Ao Shen, Hangping Zhu, Siyu Zhang, Xia Zhang, Dawen Niu*


Despite their importance in life and material sciences, the efficient construction of stereo-defined 
glycosides remains a challenge. Studies of carbohydrate functions would be advanced if glycosylation 
methods were as reliable and modular as palladium (Pd)-catalyzed cross-coupling. However, Pd-catalysis 
excels in forming sp^2-hybridized carbon centers whereas glycosylation mostly builds sp^3-hybridized
C-O linkages. We report a glycosylation platform through Pd-catalyzed S_N2 displacement from phenols
toward bench-stable, aryl-iodide-containing glycosyl sulfides. The key Pd(II) oxidative addition
intermediate diverges from an arylating agent (Csp^2 electrophile) to a glycosylating agent (Csp^3
electrophile). This method inherits many merits of cross-coupling reactions, including operational
simplicity and functional group tolerance. It preserves the S_N2 mechanism for various substrates and is
amenable to late-stage glycosylation of commercial drugs and natural products. 


The utility of glycosides in medicinal chemistry,
materials, and biological science
is well appreciated (1, 2), but difficulties
in their synthesis by glycosylation pose
substantial obstacles to exploring their
functions. Both reactants in glycosylation— 
the glycosyl donors and acceptors—are often 
structurally complex, and the properties of 
products are profoundly affected by the ab- 
solute configuration of glycosidic centers. There- 
fore, an ideal glycosylation method needs to 
simultaneously address chemo- and stereo- 
selectivity issues, two persistent challenges 
in synthesis. 

Glycoside synthesis has been propelled by 
the introduction of new donors and their ac- 
tivating approaches. Most reported glycosylation 
methods proceed under (Lewis) acid-promoted 
conditions, which convert glycosyl donors to 
(equivalents of) oxocarbenium ions for sub- 
sequent trapping by acceptors (3, 4). These 
techniques have been the cornerstones in car- 
bohydrate synthesis and have allowed for pre- 
paration of complex structures (5-8). However, 
controlling or predicting stereoselectivity re- 
mains nontrivial as the glycosylation mechanism 
often shifts within the SN1/SN2 continuum (9),
depending on the properties of reactants and
reaction parameters (10). Issues can also arise
when labile donors or harsh activating con- 
ditions are needed, complicating reaction setup. 
Among the avenues to overcome these obstacles
are developments by the Yu group (11),
which exploits selective gold-alkyne interactions;
the Jacobsen group (12, 13), which explores
mild hydrogen-bond catalysis; the Miller
group (14, 15), which harnesses the strong Ca-F
bond-forming energy; the Nguyen group (16),
which employs Lewis base catalysis; the
Codée (17), Takemoto (18), and Loh groups (19),
which utilize halogen-bond catalysis; and others
based on transition metal catalysis (20, 21). We
have reported radical activation of glycosyl 
donors (22). Despite this progress, there is 
still a high demand for methods with a gen- 
eral scope to prepare stereodefined glycosides 
in a simple and predictable manner, which 
continues to fuel mechanistic and methodo- 
logical advancements. 

The modularity and reliability inherent in 
Pd-catalyzed cross-coupling reactions have 
made them indispensable tools in organic 
synthesis (23, 24). If O-glycosylation could be 
as straightforward and robust as Pd-catalyzed 
cross coupling, the downstream exploration of 
glycosides would be greatly facilitated. However, 
Pd-catalysis is typically effective in activating and 
forging sp²-hybridized carbon centers, whereas
glycosidic bonds are mostly sp³-hybridized C-O
linkages. Strategies that can bridge this gap 
and channel the power of Pd-catalyzed cross 
coupling into the field of glycoside synthesis 
hold considerable potential. O'Doherty and 
others have applied Pd-catalyzed allylic substi- 
tution, followed by alkene (di)hydroxylation, 
for the de novo synthesis of O-glycosides (25, 26). 
Here, we report a Pd-catalyzed SN2 glycosylation
method that commences with the oxidative
addition (OA). The utility of this approach
is showcased in a general and simple SN2 glycosylation
of phenols, a prominent challenge
in O-glycoside synthesis. 


Reaction design 


General and straightforward methods for syn- 
thesizing stereodefined phenolic O-glycosides 
are highly valuable but remain nontrivial 
(14, 27, 28). Phenols—glycosylated or not—are 
abundant in both naturally occurring and man-made
compounds (Fig. 1A) (29). Introducing
carbohydrate moieties into phenols has proven
to be an effective approach for modifying their
physical and biological properties in drug discovery
endeavors. Glycosylation of phenols (3)
is complicated as they exhibit modest nucleophilicity
compared with alcohols under acidic
conditions (1 to 2; Fig. 1B). Moreover, phenols 
are ambident nucleophiles, potentially resulting 
in either O-glycosylated (4) or C-glycosylated 
products (5). 

Pd-catalyzed cross-coupling reactions 
(Fig. 1C, left cycle) are usually initiated by 
Pd(0)-mediated OA (6 to 7), followed by ligand
exchange (7 to 8) and reductive elimination
(8 to 9) to give the desired products (here
O-arylated phenols 9). Recognizing the limitation
of this cycle in activating/building Csp³
centers (30), we designed a strategy (Fig. 1C,
right cycle) that uses bench-stable,
ortho-iodobiphenyl-substituted sulfides (31, 32)
11 as glycosyl donors. The aryl iodide unit in
11 readily undergoes oxidative addition with
Pd(0) catalysts, forming an OA complex 12
that acts as an effective glycosyl (Csp³) electrophile,
likely driven by its tendency to undergo
Csp²-S reductive elimination (indicated
by dashed lines). Nucleophilic attack on 12
by phenoxides 10 proceeds through a clean
and general SN2 mechanism, resulting in inversion
of the glycosyl center and generation
of 13.

As a result of the donor activation mechanism,
this glycosylation method exhibits a notable 
tolerance toward functional groups and al- 
lows using unprotected glycosyl donors. No 
O-arylation side products are observed from 
the process (Fig. 1C), indicating that our ap- 
proach directs the Pd-containing OA complex 
(such as 12) to transition from an arylating
agent (Csp²) to a glycosylating agent (Csp³),
unveiling an unprecedented reactivity. The
OA complex 12 behaves uniformly as an SN2
electrophile, from fully oxygenated to fully 
deoxygenated donors, a rarity in carbohydrate 
chemistry. This method grants access to either 
isomer of the phenolic O-glycoside products in 
a predictable manner, many of which were pre- 
viously challenging to obtain. The transforma- 
tion occurs under mildly basic conditions and 
can be performed as easily as a Pd(0)-catalyzed 
cross-coupling reaction. 


Fig. 1. Glycosylated phenols: background and synthetic approaches. (A) Representative examples of glycosylated phenols. (B) Glycosylation of phenols:
challenges and limitations. (C) Comparison of Pd-catalyzed C-O cross-coupling and Pd-catalyzed SN2 glycosylation (this work). FG, functional group.


Reaction validation and condition optimization


Our study commenced with the model reaction
between sulfide 14 or 15 and 4-methoxyphenol
(16) to make O-glycoside 17 or 18 (Fig. 2). The
stereoselective synthesis of 2-deoxyglycosides
of phenols (e.g., 17/18) has been difficult due to
the lack of a C2-substituent as a stereodirecting
auxiliary and the susceptibility of the
2-deoxyglycoside products to acid-promoted
hydrolysis. Guided by the reaction design
in Fig. 1C, we established conditions to make
O-glycosides 17/18 from 14/15 in high yields,
with dibenzothiophene (19) formed as a 
byproduct. The β-O-glycoside 17 was obtained
from the α-S-glycoside 14, and α-O-glycoside 18
formed if β-S-glycoside 15 was employed. The
clean inversion of the glycosidic centers in 14/15
suggests that this Pd-catalyzed glycosylation proceeds
by means of an SN2-type mechanism. General
and operationally simple SN2-glycosylation
methods remain rare, despite their high value 
as tools (12, 34-36) to access stereodefined 
glycosides. Even scarcer are methods that can 
afford both stereoisomers, as accessing the corresponding
donors with defined and stable configurations
is not always simple. In our case,
sulfide donors (with general structure 11) could 
be prepared from the corresponding 1-glycosyl 
thioacetates in one pot at decagram scales (see 
SM section 3) and stored for months without 
precautions to avoid air or moisture. Employing
the Pd(0)-mediated OA as a donor-activating
approach, this method provides either isomer of
the phenolic O-glycosides. The reaction proceeds
at 60°C and only requires K2CO3 as base and a
catalytic amount (2 mol%) of Pd(0) catalyst assembled
from Pd(dba)2 and Xantphos.


[Scheme: 14 or 15 (1.0 equiv.) + 16 (1.5 equiv.); Pd(0)-Xantphos (2 mol%), K2CO3 (2.0 equiv.), toluene (0.1 M), 60 °C. Products: 17 (95% yield, α:β < 1:19); 18 (90% yield, α:β > 19:1).]

Deviation from standard conditions (with 15 used)

Entry  Deviation                          Conversion  Yield (α:β)
1      none                               100%        92% (>19:1)
2      no Pd(0)-Xantphos                  0%          0%
3      no K2CO3                           7%          5% (>19:1)
4      Cs2CO3 instead of K2CO3            100%        94% (>19:1)
5      TEA instead of K2CO3               45%         20% (>19:1)
6      TMG instead of K2CO3               79%         54% (2:3)
7      K2CO3 aq.                          87%         71% (9:1)
8      room temperature                   21%         19% (>19:1)
9      MeCN instead of toluene            41%         40% (1:1)
10     no XantPhos                        77%         75% (>19:1)
11     NiXantPhos instead of XantPhos     100%        85% (4:1)
12     DavePhos instead of XantPhos       67%         56% (>19:1)
13     dppb instead of XantPhos           100%        90% (3:1)
14     addition of TEMPO                  100%        91% (10:1)

Ligands: Xantphos, Ni-Xantphos, Davephos, dppb. Byproduct: 19.

Fig. 2. Reaction validation and condition optimization. Reactions in this table were performed at 0.05 mmol
scale at 0.1 M concentration for 24 hours. Yields and conversions are determined by 1H NMR analysis using
1,3,5-trimethoxybenzene as an internal standard. Diastereomeric ratios were determined by NMR analysis of crude
reaction mixture. TMG, 1,1,3,3-tetramethylguanidine; TEMPO, (2,2,6,6-tetramethylpiperidin-1-yl)oxyl; Bn, benzyl.
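The quantities behind one screening reaction follow directly from the scale, equivalents, catalyst loading, and concentration quoted for Fig. 2. The short Python sketch below works through that arithmetic. It assumes that the 0.1 M concentration refers to the glycosyl donor as the limiting reagent, which the text does not state explicitly, and the function and variable names are illustrative only.

```python
# Reagent amounts for one Fig. 2 screening reaction, computed from the
# stated scale (0.05 mmol), concentration (0.1 M), equivalents, and
# catalyst loading (2 mol%).
# Assumption: the concentration is defined with respect to the donor.


def reaction_amounts(scale_mmol, conc_molar, equivalents, catalyst_mol_percent):
    """Return the solvent volume (mL) and the amount of each component (mmol)."""
    # mmol divided by mol/L gives mL
    volume_ml = scale_mmol / conc_molar
    amounts = {name: scale_mmol * eq for name, eq in equivalents.items()}
    amounts["Pd(0)-Xantphos"] = scale_mmol * catalyst_mol_percent / 100.0
    return volume_ml, amounts


volume_ml, amounts = reaction_amounts(
    scale_mmol=0.05,
    conc_molar=0.1,
    equivalents={"donor 14 or 15": 1.0, "phenol 16": 1.5, "K2CO3": 2.0},
    catalyst_mol_percent=2.0,
)

for name, mmol in amounts.items():
    print(f"{name}: {mmol:.4f} mmol")
print(f"toluene: {volume_ml:.2f} mL")
```

Under these assumptions, a 0.05 mmol reaction uses about 0.5 mL of toluene and about 0.001 mmol of the Pd(0) catalyst.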

We conducted control experiments to identify
factors influencing the performance of this
method. Little reaction occurred in the absence
of Pd(0) catalyst (entry 2) or a suitable base
(entry 3). Inorganic bases K2CO3 (entry 1) and
Cs2CO3 (entry 4) exhibited better effects than
organic bases such as Et3N (entry 5) and TMG
(entry 6). A saturated aqueous solution of K2CO3
could be used (entry 7), suggesting that the reaction
has a good tolerance to water. When the
reaction is conducted at room temperature
(roughly 25°C), the conversion is relatively low
within the same time period (entry 8). The use
of more polar solvents such as MeCN greatly 
eroded the stereochemical purity of products 
(entry 9). The added ligands exert pronounced 
effects on the reaction efficiency (entries 10 to 
13), likely through modulating the properties 
of OA complex such as 12 in Fig. 1C. Unlike 
phosphine-based ligands, nitrogen-based ligands 
and N-heterocyclic carbene ligands we tried 
were less effective (see SM, section 4). Addition 
of a stoichiometric amount of TEMPO was fully 
tolerated, essentially excluding a radical-based 
mechanism (entry 14) (22, 37). 
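To compare entries in Fig. 2 on a common footing, the reported α:β ratios can be converted to an α fraction, and the NMR yield can be expressed relative to the donor actually consumed. Neither derived quantity is reported by the authors; the Python sketch below is only an illustration, using conversions, yields, and ratios transcribed from entries 6, 7, and 9. Entries reported as ">19:1" are bounds rather than ratios and are left out.

```python
# Derived quantities for selected Fig. 2 entries.
# The input numbers are transcribed from the table; the derived values are
# illustrative and are not reported in the article.


def alpha_fraction(alpha, beta):
    """Fraction of the alpha anomer for a given alpha:beta ratio."""
    return alpha / (alpha + beta)


def yield_on_converted_donor(yield_percent, conversion_percent):
    """Yield relative to the donor that was consumed, in percent."""
    return 100.0 * yield_percent / conversion_percent


entries = {
    6: {"change": "TMG as base", "conversion": 79, "yield": 54, "ratio": (2, 3)},
    7: {"change": "aqueous K2CO3", "conversion": 87, "yield": 71, "ratio": (9, 1)},
    9: {"change": "MeCN as solvent", "conversion": 41, "yield": 40, "ratio": (1, 1)},
}

summary = {
    number: (
        alpha_fraction(*entry["ratio"]),
        yield_on_converted_donor(entry["yield"], entry["conversion"]),
    )
    for number, entry in entries.items()
}

for number, (frac, corrected) in summary.items():
    print(f"entry {number}: alpha fraction {frac:.2f}, "
          f"yield on converted donor {corrected:.1f}%")
```

Read this way, the MeCN run (entry 9) converts nearly all consumed donor into product but loses stereocontrol entirely, whereas the TMG run (entry 6) loses both material and selectivity.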


Substrate scope 


It is rare for a glycosylation method to hold 
high efficiency for a broad array of substrates 
as the reaction mechanism (and outcome) often 
varies with the reactivities of donors and acceptors
(38-41). Our method accommodates
diverse glycosyl units (Fig. 3), providing ei- 
ther of the two possible stereoisomers with high 
purities (22a-m). For example, donors bearing 
benzyl (22a, 22k, 22l), acetyl (22b, 22f),
benzylidene (22d) and silyl-protecting groups 
(22c, 22e) could all be used. Various other 
2-deoxypyranosyl groups (22f-j) were in- 
stalled with similar efficiencies. Our meth- 
od could be adopted to construct the more 
electron-rich furanosyl linkages: both iso- 
mers of 2-deoxyribosides were generated 
cleanly (22k). This method is not limited to 
2-deoxy sugars, and the SN2 mechanism operates
with fully oxygenated (22l and 38g/h
in Fig. 4) and fully deoxygenated tetrahydro- 
pyranyl (22m) donors. 2-O-acetyl or 2-N-acetyl 
protected donors were unsuccessful, likely be- 
cause of interference from these neighbor- 
ing participating groups (fig. S6). Aliphatic 
alcohols are almost inert under the current 
conditions, allowing the use of unprotected 
glycosyl donors, as shown by examples 22¢-j. 
The results attest to the exceptional func- 
tional group tolerance of this OA-initiated 
glycosylation method. It is worth highlight- 
ing that many of the glycosyl units in Fig. 3 
are deoxygenated and electron rich. To obtain 
the corresponding O-glycosides with high 
stereochemical purities would be tedious 
by conventional methods, due to dearth of 
suitable donors, lack of stereo-directing auxiliaries,
and susceptibility of products to acid-
promoted hydrolysis. 

Both electron-rich (22w, 22ae, 22ah) and 
electron-deficient phenols (22n-r) were ac- 
commodated, with no product arising from 
C-glycosylation observed. Phenols bearing an 
ortho-substituent were competent substrates 
(22u-w, 22y-z). Functional groups such as 
aldehydes (22q), esters (22r), secondary amides 
(22af), nitriles (22p), and ketones (22x) were 
tolerated. The terminal alkene group in 22v did 
not isomerize to conjugate with a phenyl ring.

Fig. 3. Substrate scope. Unless otherwise noted, reactions in this table were performed at 0.1 or 0.2 mmol scale in toluene (0.1 M) for 24 hours, using Pd(PPh3)4
(2 mol%), Xantphos (4 mol%), and K2CO3. Isolated yields are reported. Diastereomeric ratios were determined by NMR analysis of crude reaction mixture.
*The reaction was run at 80°C; †Ni-XantPhos was used instead of XantPhos; ‡Cs2CO3 as base. See SM, section 5 for experimental details.


Potentially chelating heterocycles including 
dioxolanes (22t), thiazoles (22aa), quinolines 
(22ab), pyridines (22ac), oxazoles (22ad), and 
morpholines (22ae) were incorporated. Protic 
hydrogen atoms in secondary amides (22af) 
were compatible. Tyrosine derivatives could
be glycosylated (22af), and the α-stereocenter
in the amino acid backbone stayed intact. Aryl 
bromides/chlorides (22n, 22u, 22y-z) did 
not interfere with this method, likely because 
the oxidative addition to aryl iodide units in 
donor 14 or 15 is a faster process. The remaining
aryl halide groups could serve as handles
for further derivatizations (see below). Partic- 
ularly noteworthy is that aryl boronic esters 
(22s) survived the reaction conditions with- 
out undergoing a Suzuki-Miyaura reaction, 
highlighting the distinctive reactivity of our
Pd(II) OA complexes (i.e., 12 in Fig. 1C) as
glycosyl (Csp³) electrophiles.

Fig. 4. Synthetic application. (A) Glycosylation of natural products and commercial drugs. (B) One-pot, multistep, multicomponent reactions. Reactions in this table
were performed at 0.1 or 0.2 mmol scale in toluene (0.1 M) for 24 hours, using Pd(PPh3)4 (2 mol%), Xantphos (4 mol%), and Cs2CO3. Isolated yields are reported.
Diastereomeric ratios were determined by NMR analysis. *K2CO3 used as base; †Reaction performed at 100 °C. See SM sections 6 and 7 for experimental details.


Synthetic application


To demonstrate the utility of this method, we
applied it in the modification of natural products
and commercial drugs (Fig. 4A). Simple
phenols such as vanillin (25) and triclosan (34)
were glycosylated smoothly. Internal alkene
groups are accommodated, as exemplified by
the formation of 30-31 and 35. The tertiary
amine groups in drug molecules may interfere
with acid-promoted glycosylation methods,
but show excellent compatibility with our
conditions (26-27, 33). Under our conditions,
protection of aliphatic alcohols is unnecessary
(29, 33). In the case of chrysin (23), the C7-OH
is glycosylated with high regioselectivity.
For polyphenols such as icaritin (37), two glycosyl
units can be installed at a time, both with
high selectivities. Glycosylated anthraquinones
are ubiquitous in nature and we show that
2-hydroxyanthraquinone is a competent substrate
in our reaction (24). O-glycosylated coumarins
are frequently employed as fluorescent
probes for detection of glycosidases (42), and
coumarins are glycosylated cleanly (32). Steroidal
phenols such as estrone (28) and estradiol
(29) are modified. A glycosyl unit was installed
onto ezetimibe (36), a cholesterol absorption
inhibitor containing a sensitive β-lactam unit.
SN-38 is a potent anticancer agent derived from
camptothecin (43) but is poorly soluble in water.
Conjugation with a sugar may improve the
pharmacological profile of the parent molecule;
see Afeletecan in Fig. 1A. We mounted an array
of glycosyl units, including both C2-deoxygenated
(38a-f) and C2-oxygenated sugars (38g-h), onto
an SN-38 derivative (Fig. 4A).

Fig. 5. Mechanistic studies. (A) OA complex 45, formation, isolation, structure and reactivity. (B) Computed energetics for potential reaction pathways. Free
energies are computed by B3LYP-D3/def2-tzvp/SMD(toluene)//B3LYP-D3/def2-svp. (C) Reactivity of various phenols. Procedures to determine relative rate k_rel:
phenol (0.1 mmol), substituted phenol 48 (0.1 mmol) and glycosyl donor 15 (0.1 mmol) are allowed to react under standard conditions and stopped at ~30%
conversion. k_rel, conversion of substituted phenol divided by conversion of phenol. The results are the average of three runs. Error bars represent standard deviations.
See SM section 8 for experimental details.

Pd-catalyzed SN2 glycosylation could be
performed in tandem with other Pd-catalyzed 
cross-coupling reactions, affording carbohydrate- 
containing compounds by one-pot, multistep 
processes in which a single Pd(0)-complex 
catalyzes two distinct steps (Fig. 4B). When
glycosyl donor 14 or 15, m-bromophenol (39),
and an aryl boronic ester are mixed in one pot
with a Pd(0) catalyst and a base, the Pd-catalyzed
glycosylation and the Suzuki-Miyaura coupling proceed
in sequence, to give loratadine derivative 41 
or L-tyrosine derivative 40 in useful overall 
yields. The glycosylation could also be merged 
with Buchwald-Hartwig (44, 45) coupling if
the ligand is switched to BrettPhos (46) and
m-toluamide is used as the third coupling partner;
consecutive formation of Csp³-O and Csp²-N
bonds delivered 42 smoothly. Lastly, a sequence
composed of glycosylation/Sonogashira cou- 
pling provides ethisterone-sugar conjugate 
43 in 61% overall yield. These examples il- 
lustrate selectivity of our Pd-catalyzed glyco- 
sylation and suggest a rapid approach to make 
glycoconjugates. 


Mechanistic studies 


Acetyl-protected sulfide donor 44 reacted with 
Pd(PPh3)4 smoothly, and the OA complex 45
could be isolated from the mixture by column 
chromatography (Fig. 5A). The facility of the 
OA step may be attributable to the sulfur atom 
in 44, which could pre-coordinate with Pd(0). 
Complex 45 is a crystalline solid and its molecular
structure was verified by x-ray crystallography.
Similar to other classical Pd(II) OA
complexes (47), the Pd center in 45 adopts a 
(slightly twisted) square planar configuration, 
with the two neutral ligands (i.e., sulfur and 
phosphine atoms) occupying para positions. 
From the solid-state structure of 45, we noted 
that the C1-S bond is slightly elongated (1.83 to
1.87 Å) and the C1-O bond is shortened (1.41 to
1.39 Å) from their normal lengths (48), indicating
buildup of a positive charge at the C1 position. 

Treating 45 with 4-methoxy phenol (16) in 
the presence of K2CO3 afforded 22b with decent
efficiency and inverted configuration. Addition
of external ligand Xantphos gave similar
results. A small amount of 45 (2 mol%) could 
catalyze the reaction between 44 and 16 to 
form 22b. These results support 45 as a re- 
active intermediate in our process. 

Deng et al., Science 382, 928-935 (2023)    24 November 2023

We next performed DFT calculations
(B3LYP-D3/def2-tzvp/SMD(toluene)//B3LYP-D3/def2-svp)
employing 45 and phenoxide 46 (or cesium
phenoxide, see SM section 9) as model substrates
(Fig. 5B). Two potential pathways were
examined: In principle, reductive elimination 
of C-S bond in 45 to yield sulfonium Int-I (by 
means of TS-I), followed by phenoxide attack 
from 46 could afford glycoside 47. Alterna- 
tively, direct attack of phenoxide 46 toward 
45 by means of TS-II, followed by reductive 
elimination of the C-S bond (45) in Int-II 
would provide 47 as well, affording 19 as a
byproduct and regenerating Pd(0) catalyst. 
Upon comparing the free energies of species 
TS-I and TS-II (note: with different charge 
states), we observed that the pathway through 
TS-II exhibits lower barriers. Presumably, co- 
ordination of the sulfur atom to the Pd center 
polarized the C-S bond in 45 and made it 
electrophilic enough toward phenoxide attack. 
We also considered a scenario where the iodine 
anion dissociates early from 45 before SN2 displacement
by 46 (see SM, section 9).

We also compared the relative reactivities 
of various phenols bearing different para- 
substituents (48) in our glycosylation reaction 
(Fig. 5C). In internal competition experiments, 
those with electron-withdrawing substituents 
(49) react at faster rates, presumably because 
they are more easily deprotonated under basic 
conditions. By external competition experi- 
ments, we found that the turnover frequency, 
as inferred from the conversion of 15, increases 
with the electron-withdrawing ability of the 
para-substituent. The absence of phenol 48 
resulted in essentially no reaction. These re- 
sults indicate the involvement of phenoxide 
nucleophiles in the turnover-determining step 
and in turn lend some further support for 
the pathway through TS-II. Although addi- 
tional experiments are warranted to elucidate 
mechanistic details, the ability of Pd-containing 
OA complex to serve as a glycosyl (Csp³) electrophile
was quite general (see examples in
Figs. 3 and 4). 
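The Fig. 5C caption defines the relative rate k_rel for each competition experiment as the conversion of the substituted phenol divided by the conversion of phenol, averaged over three runs with standard deviations as error bars. A minimal Python sketch of that calculation is given below. The conversion values are hypothetical placeholders, not data from the article, and the use of the sample standard deviation is an assumption because the caption does not say which estimator was used.

```python
from statistics import mean, stdev

# Relative rate from an internal competition experiment, following the
# definition in the Fig. 5C caption:
#   k_rel = conversion of substituted phenol / conversion of phenol
# The conversions below are HYPOTHETICAL illustration values.


def relative_rate(conv_substituted, conv_phenol):
    """k_rel for a single competition run (conversions as fractions)."""
    return conv_substituted / conv_phenol


def summarize(runs):
    """Mean and sample standard deviation of k_rel over replicate runs."""
    values = [relative_rate(sub, ref) for sub, ref in runs]
    return mean(values), stdev(values)


# (conversion of substituted phenol, conversion of phenol), three runs
hypothetical_runs = [(0.20, 0.10), (0.21, 0.10), (0.19, 0.10)]

k_mean, k_sd = summarize(hypothetical_runs)
print(f"k_rel = {k_mean:.2f} +/- {k_sd:.2f}")
```

A k_rel above 1 means the substituted phenol is consumed faster than phenol itself, which is the trend the text reports for electron-withdrawing para-substituents.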


Conclusions 


We developed a strategy that exploits Pd(0)- 
mediated oxidative addition—the initial step 
in classical cross-coupling reactions—as a tool 
to activate glycosyl donors. The key to this 
success was the use of aryl iodide-containing 
glycosyl sulfides as donors, which upon reaction
with Pd(0)-catalysts furnished Pd(II)-containing
OA complexes that act as glycosyl
(Csp³) electrophiles. The following glycosidic
bond-forming stage caused clean inversion of 
the glycosyl centers in donors, and either ste- 
reoisomer could be obtained in a predictable 
manner. This approach enabled a general method
for SN2-glycosylation of phenols, allowing
for the synthesis of phenolic O-glycosides that 
were previously challenging to access. 

The mechanism grants this reaction operational
simplicity and functional group tolerance.
No acid or cryogenic conditions are
required, and the reaction can be set up similarly
to other Pd-catalyzed C-O cross-coupling
reactions. Moreover, the method is amenable 
to late-stage glycosylation of a wide range of 
commercial drugs and natural products. The 
generality and mildness of the method are further
showcased in several one-pot, multistep,
multicomponent reactions. We anticipate that 
this study will bring opportunities in Pd-mediated 
glycosylation reactions, enabling advancements 
in carbohydrate synthesis and its application in 
various fields.


Air channels create a directional light signal to
regulate hypocotyl phototropism


Ganesh M. Nawkar†‡, Martina Legris†, Anupama Goyal§, Emanuel Schmid-Siegert¶, Jérémy Fleury,
Antonio Mucciolo, Damien De Bellis, Martine Trevisan, Andreas Schueler, Christian Fankhauser*


In plants, light direction is perceived by the phototropin photoreceptors, which trigger directional growth 
responses known as phototropism. The formation of a phototropin activation gradient across a 
photosensitive organ initiates this response. However, the optical tissue properties that functionally 
contribute to phototropism remain unclear. In this work, we show that intercellular air channels limit light 
transmittance through various organs in several species. Air channels enhance light scattering in 
Arabidopsis hypocotyls, thereby steepening the light gradient. This is required for an efficient 
phototropic response in Arabidopsis and Brassica. We identified an embryonically expressed ABC 
transporter required for the presence of air channels in seedlings and a structure surrounding them. 
Our work provides insights into intercellular air space development or maintenance and identifies a
mechanism of directional light sensing in plants.


The capacity to detect light direction is
shared by many organisms, including cyanobacteria,
unicellular protists, animals,
and plants (1-3). In addition to a photoreceptor,
directional light sensing often
requires shielding pigments (3), whereas in 
cyanobacteria, optic features of the cell focus 
the light (lens effect) to provide a directional 
cue (2). In flowering plants, light direction is 
sensed by the blue light-absorbing photo- 
tropin photoreceptors (phototropin 1 and 2 in 
Arabidopsis) (4). This typically leads to growth 
toward the light, or positive phototropism, in 
aerial organs such as hypocotyls and stems (5). 
This response is believed to contribute to the 
optimization of photosynthesis, particularly in 
limiting light conditions (6-8). In dicotyledons 
like Arabidopsis thaliana, the upper hypocotyl 
is both the site of light perception and the site 
of differential growth, which ultimately leads 
to organ repositioning (9, 10). Upon unilateral
blue light irradiation, differential phototropin
activation between the lit and the shaded
side of the seedling is considered to be the first 
step triggering phototropism (4, 11). Substantial
progress has been made in elucidating the
downstream steps that link phototropin activation
and the differential growth response (4).
However, the optical features of light-sensing 
tissues that enable the formation of a light 
gradient required for a phototropic response 
remain poorly characterized (J). 


Transparency of hypocotyls causes 
phototropic defects 


In a screen for Arabidopsis seedlings with re- 
duced phototropism, we identified a mutant 
with transparent hypocotyls (Fig. 1). The causal 
gene was mapped to ATP-BINDING CASSETTE
G5 (ABCG5), which was confirmed by compar- 
ing the phenotype of multiple alleles and by 
complementation (fig. S1). We hypothesized 
that the phototropic defect in the mutant was 
caused by enhanced light transmission and a 
shallower light gradient in the upper part of 
the hypocotyl. To test this hypothesis, we com- 
pared abcg5-5 (hereafter referred to as abcg5) 
and two previously identified cristal mutants 
(cri7 and cri8), which also have transparent 
hypocotyls (12). The defective gene in these 
mutants is not known (12), but they are not 
allelic (12), and we found that the ABCG5 gene 
was unaltered in these mutants (see materials 
and methods). The three mutants had sim- 
ilarly enhanced light transmission in the hypo- 
cotyl (Fig. 1A). In response to low unilateral 
blue light, abcg5, cri7, and cri8 growth orien- 
tation was random, whereas the wild type (WT) 
aligned with the light source and phot1 was
unresponsive (Fig. 1B). To determine whether 
these mutants have a specific hypocotyl tropism 
defect, we analyzed their gravitropic response, 
which also relies on asymmetric growth caused 
by redistribution of the growth hormone auxin 
(13). An auxin-transporter mutant deficient in
three PIN-FORMED genes (pin3, pin4, and 
pin7) showed the expected inability to reor- 
ient its hypocotyl upon gravistimulation (Fig. 
1C). By contrast, the response of abcg5 and cri7
was similar to that of the WT, whereas cri8
showed a reduced response to gravity (Fig. 1C).
Both phototropism and gravitropism depend 
on growth; we therefore measured hypocotyl 
elongation during the phototropic experiment 
and found that abcg5 hypocotyls grew like the 
WT, whereas cristal mutants showed a growth 
defect that potentially explained the gravitropic 
defect of cri8 (fig. S2). We conclude that having 
a transparent hypocotyl correlates with an al- 
tered ability to respond to light direction, and 
we pursued our study with the abcg5 mutant 
because of the pleiotropic nature of cristal 
mutants (12).

The phototropic defect of abcg5 was observed
at low and 100 times higher fluence
rates, illustrating the importance of ABCG5 in
this process (fig. S3, A and B). If the abcg5
phototropic defect was due to impaired light
direction sensing, we would expect that it requires
active phototropins. Consistent with this
hypothesis, the phot1abcg5 double mutant behaved
like a phot1 mutant in response to low
blue light (fig. S3C), whereas many abcg5 mutant
seedlings grew in the opposite direction
(Fig. 1 and fig. S3C). These experiments were
performed with seedlings grown on vertical
plates and in contact with the media. Because
agar scatters light and the plants were growing
at the media-air interface, we tested the
effects of a simpler light environment. The
following experiments were performed on
free-standing etiolated seedlings to obtain a
simpler directional light cue (see materials
and methods). To better characterize the phototropic
phenotype of the abcg5 mutant, we
analyzed pulse-induced first positive phototropism,
which corresponds to the conditions
where light fluence (µmol m⁻²) is proportional
to the phototropic response, whereas further
increasing light fluence inhibits the response
(14). We irradiated etiolated seedlings with a
1-min blue light pulse of various intensities
(1.7, 0.17, 0.017, and 0.0017 µmol m⁻² s⁻¹). In
agreement with previous reports (15, 16), WT
plants showed a bell-shaped fluence-response
curve. However, abcg5 mutants did not respond
to this light cue (fig. S3D).
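Because the pulse experiments are described in terms of fluence (µmol m⁻²) while the intensities are quoted as fluence rates (µmol m⁻² s⁻¹), it can help to convert one into the other: fluence is the fluence rate multiplied by the pulse duration. The Python sketch below applies this to the four intensities and the 1-min pulse given in the text. The function and variable names are illustrative only.

```python
# Fluence delivered by a light pulse: fluence = fluence rate x duration.
# Fluence rates (umol m^-2 s^-1) and the 1-min pulse length are taken
# from the text.

PULSE_SECONDS = 60.0
FLUENCE_RATES = [1.7, 0.17, 0.017, 0.0017]  # umol m^-2 s^-1


def fluence(rate, seconds):
    """Total fluence (umol m^-2) for a constant fluence rate over a pulse."""
    return rate * seconds


fluences = [fluence(rate, PULSE_SECONDS) for rate in FLUENCE_RATES]

for rate, total in zip(FLUENCE_RATES, fluences):
    print(f"{rate} umol m^-2 s^-1 for 60 s -> {total:.3f} umol m^-2")
```

The four pulses therefore span three orders of magnitude in fluence, from roughly 0.1 to roughly 100 µmol m⁻², which is the range over which the bell-shaped fluence-response curve is sampled.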

Next, we examined phototropism by irradi- 
ating etiolated seedlings continuously with 
different blue light fluences (0.025, 0.125, and 
2.5 µmol m⁻² s⁻¹) and recorded growth reorientation
over 6 hours. Confirming earlier
work (16), we found that WT plants took longer 
to reach maximum curvature with increasing 
light intensity. The abcg5 mutant took much 
longer than the WT to reach maximum cur- 
vature at all tested light intensities (fig. S3E). 
We used a bidirectional light treatment to fur- 
ther test the ability of seedlings to respond to 
complex light environments. In such condi- 
tions, WT plants grew toward the stronger 
light source (7). In these experiments, seedlings 
received a similar light intensity but different
light gradients depending on their position
(P): steep (P1 and P6), medium (P2 and P5), or 
shallow (P3 and P4) (fig. S3F). The ability of 
abcg5 mutants to grow toward stronger light 
was reduced particularly when the gradient 
was shallow (fig. S3G). Moreover, their re- 
sponse was slower in this situation (fig. S3H). 
The WT reoriented more slowly in bidirectional 
light treatments than with unilateral lighting 
(fig. S3, E and G), indicating that when the 
light gradient across the hypocotyl was reduced, 
the WT also delayed the phototropic response. 
Collectively, our phototropism experiments in- 
dicated that abcg5 mutants showed a photo- 
tropic defect under all conditions tested. The 
phenotype was particularly pronounced in re- 
sponse to very low light fluences, high fluence 
rates, and in complex light environments, such 
as on the surface of plates or with bidirectional 
irradiation. 

Despite these obvious phototropic defects, we
found that phot1-mediated phosphorylation events
that occurred within minutes of light percep-
tion were not altered in the abcg5 mutant and
occurred with a timing similar to what has
been previously reported (fig. S4A) (17-19). In-
deed, the blue light-induced mobility shifts of
phot1 and its targets NPH3 and PKS4 observed
on SDS-polyacrylamide gel electrophoresis
were normal in abcg5. However, formation of
an auxin gradient across the hypocotyl, as in-
ferred with an auxin-signaling input reporter
(8), was defective in the abcg5 mutant (fig.
S4B). This is consistent with the abcg5 photo-
tropic defect being due to a reduced ability to
establish a light gradient across the hypocotyl
rather than in early phototropin signaling steps
(fig. S4) (19-21). Moreover, the defective blue
light response of abcg5 was specific to photo-
tropism as blue light-induced inhibition of hypo-
cotyl elongation was unaltered in abcg5 (fig.
S5A). Transparency of the abcg5 mutant was
restricted to embryonic organs and observed in
roots, hypocotyls, and cotyledons (fig. S5B).
Despite retarded growth (22) and the low sur-
vival rate of light-grown seedlings, true leaves
and other plant organs developed similarly to
those of the WT and were not transparent (fig.
S5, B and C). This allowed us to test the photo-
tropic response in petioles. abcg5 mutant petioles
showed a WT response (fig. S5D), further cor-
relating the phototropic defect of the mutant
with transparent organs. Analysis of ABCG5
gene expression with a reporter line showed
that the gene was strongly expressed in de-
veloping embryos, with highest expression in
the cortex (fig. S6A), whereas we did not detect
ABCG5 in seedlings expressing pABCG5:GFP-
ABCG5 or significant expression of the afore-
mentioned reporter line. This is consistent
with previous reports that showed that ABCG5
is expressed in the developing embryo (23-25)
(fig. S6, B, C, and D). We could complement the
mutant by expressing the GFP-ABCG5 trans-
gene from the ABCG5 promoter. However,
complementation was unsuccessful when the
construct was driven by the viral 35S promoter
(fig. S6E). One possible explanation for this
finding is that the 35S promoter does not drive
gene expression during the early stages of em-
bryogenesis, as reported in other species (26, 27).
The ABCG5 expression pattern correlates with
the phenotype of abcg5 mutants that is restricted
to embryonic organs (fig. S5). The function of
genes closely related to ABCG5 is currently un-
known (28, 29). Context-dependent redundancy
between members of this clade may explain the
seedling-specific phenotype of abcg5. We con-
clude that abcg5 has a specific phototropic
defect in seedlings that is most likely due to
hypocotyl transparency.

Fig. 1. Transparent hypocotyl mutants are specifically impaired in the percep-
tion of light direction. (A) (Left) Hypocotyl transmittance quantified from
bright-field microscopy images of 3-day-old etiolated seedlings. (Right) Representative
images used for quantification. Scale bar, 100 µm. Transmittance was quantified as
the ratio between the mean gray value of a circular region of interest (ROI) on
the hypocotyl and a ROI in an empty region of the image (yellow circles). Data
are mean ± SEM of n = 10 seedlings. *P < 0.0001 in analysis of variance
(ANOVA) followed by Dunnett's multiple comparisons test. (B) Radar plots
showing the results of phototropic assays. 3-day-old etiolated seedlings growing
on vertical plates were treated with unilateral blue light (BL; 0.025 µmol m⁻² s⁻¹,
represented by the blue arrow) for 24 hours and final growth angle relative
to vertical was measured. The length of the blue bars represents the
frequency of bending angles in 10° intervals. Data distribution for each genotype
was compared to the WT with a Kolmogorov-Smirnov test (P values below each
graph). n_WT = 42, n_phot1 = 32, n_abcg5 = 42, n_cri7 = 45, and n_cri8 = 28. (C) Radar plots
showing the results of gravitropic assays. 3-day-old etiolated seedlings growing on
vertical plates were rotated by 90° and growth angle relative to horizontal was
measured 24 hours later. The length of the red bars represents the frequency of
bending angles in 10° intervals. Data distribution for each genotype was
compared to the WT with a Kolmogorov-Smirnov test (P values below each graph).
n_WT = 49, n_pin3,4,7 = 41, n_abcg5 = 43, n_cri7 = 35, and n_cri8 = 32. The red arrow shows
the direction of the gravity vector (g).


ABCG5 is required for intercellular air channels


ABCG5 is required for cuticle development in
cotyledons, with the mutant showing higher
permeability of cotyledons in light-grown seed-
lings (22). Thus, we evaluated the hypothesis
that a defect in cuticle development in the
hypocotyl explains the phototropic defect and
hypocotyl transparency in etiolated seedlings.
We assessed the hypocotyl’s cuticle structure
by transmission electron microscopy (TEM). We
did not find large differences between WT
and abcg5; although, as reported for the co-
tyledons (22), the cuticle in abcg5 was slightly
less compact than in the WT (fig. S7A). How-
ever, it was still functional as the toluidine
blue cuticle permeability test showed that in
etiolated seedlings, abcg5 and WT had a similar
permeability, which contrasts with the previ-
ously reported cuticle phenotype of the lacs2 mu-
tant (30) (fig. S7B). Moreover, despite cuticular
defects, hypocotyls of lacs2 mutants were not
transparent (fig. S7C) and had a robust photo-
tropic response (fig. S7D), indicating that
hypocotyl transparency and the phototropic
defect in abcg5 are presumably unrelated to
the cuticle. TEM images also showed that hy-
pocotyl cell walls had the same thickness in
abcg5 and the WT, indicating that enhanced
light transmission cannot be explained by thin-
ner cell walls in the mutant (fig. S8A). Light
absorbing pigments were proposed to con-
tribute to light gradient formation across pho-
tostimulated plant tissues, thereby enabling
phototropism (3, 32). We therefore analyzed
the absorption spectrum of soluble crude ex-
tracts from etiolated seedlings and found that
abcg5 extracts showed an absorption spec-
trum comparable to that of the WT (fig. S8B).
We conclude that hypocotyl transparency in
the abcg5 mutant is not due to a difference
in the cuticle, cell wall thickness, or soluble
pigments. 
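
Hypocotyl transparency in these comparisons rests on the transmittance measure defined in the Fig. 1 legend: the mean gray value of a circular ROI on the hypocotyl divided by that of an ROI in an empty region of the same image. The following Python sketch shows one way such a ratio could be computed. The helper names, the ROI radius, and the synthetic image are assumptions made for illustration and do not reproduce the authors' image-analysis pipeline.

```python
def circular_roi_mean(image, center, radius):
    """Mean gray value of all pixels within `radius` of `center` (row, col)."""
    c_row, c_col = center
    values = [
        pixel
        for r, row in enumerate(image)
        for c, pixel in enumerate(row)
        if (r - c_row) ** 2 + (c - c_col) ** 2 <= radius ** 2
    ]
    return sum(values) / len(values)


def percent_transmittance(image, hypocotyl_center, background_center, radius):
    """Hypocotyl ROI mean relative to the empty-background ROI mean, in percent."""
    hypocotyl = circular_roi_mean(image, hypocotyl_center, radius)
    background = circular_roi_mean(image, background_center, radius)
    return 100.0 * hypocotyl / background


# Synthetic bright-field image (made-up values): a bright background of 200
# with a darker vertical band of 120 standing in for the hypocotyl.
image = [[120.0 if 40 <= col < 60 else 200.0 for col in range(100)]
         for _ in range(100)]

transmittance = percent_transmittance(
    image, hypocotyl_center=(50, 50), background_center=(50, 15), radius=8
)
print(f"{transmittance:.1f}% transmitted light")
```

Normalizing to a background ROI in the same image makes the measure insensitive to differences in lamp intensity or exposure between images.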

Light-grown abcg5 seedlings sink in water,
suggesting that they contain less air than the
WT (22). We performed a floating assay with
etiolated seedlings and found that whole seed-
lings as well as dissected roots and hypocotyls
of the abcg5 mutant sank more frequently than
the WT, which was indicative of reduced air
content in the abcg5 mutant (fig. S9A). This
phenotype may already be present in embryos,
given that a floating assay with dissected em-
bryos was consistent with a higher density of
abcg5 than WT embryos (fig. S9B). Air chan-
nels were previously observed in embryos and
hypocotyls of several species and are present
in intercellular spaces at the tricellular junc-
tions between cortex cells or between cortex
and epidermal cells (33-35). Thus, we analyzed
transverse cuts of etiolated hypocotyls using
cryo-scanning electron microscopy (cryo-SEM)
and TEM. Both in WT and abcg5 hypocotyls,
we observed intercellular spaces at the tricel-
lular junctions formed by epidermal and cortex
cells. However, those spaces appeared to be
empty in the WT, whereas they were filled in
the abcg5 mutant (Fig. 2, A and B, and figs. S10
and S11). The fracture pattern observed from
cryo-SEM images shows that the intercellular
space in the abcg5 mutants is distinct from
cell walls (fig. S10). Using TEM tomograms,
we observed a fibrillar network in the inter-
cellular space of abcg5 mutants (fig. S11B and
movies S1 and S2). In the WT, a well-defined
electron-dense layer lined the outer side of the
cell wall surrounding the intercellular spaces,
whereas in abcg5 mutants, this layer was diffuse,
heterogeneous, and sometimes absent (Fig. 2A
and fig. S11). Moreover, using TEM we could show
that abcg5 already had an intercellular space
phenotype in the embryo (fig. S11C). To further
investigate whether the difference between WT
and abcg5 hypocotyls was the presence of air in
intercellular spaces, we used three-dimensional
(3D) nondestructive x-ray microtomography
(34, 35). We detected air channels in the lon-
gitudinal direction in the WT, but not in abcg5
mutants (Fig. 2C and fig. S12). Collectively, our
data suggest that the difference in light trans-
mission between the WT and abcg5 may be ex-
plained by the presence of air in the intercellular
spaces of the WT, but not in the mutant.

Fig. 2. The difference between WT and abcg5 mutant hypocotyls is the pres-
ence of air channels. (A) TEM images of intercellular spaces (is) in WT and abcg5.
(Inset) Close up of a corner. Scale bar, 1 µm. Black arrowheads mark the electron-
dense layer in the WT, and pink arrowheads mark similar positions in abcg5. (B) (Left)
Cryo-SEM images of freeze-fractured transverse cuts of 3-day-old etiolated hypocotyls
(top) or close-up looks of intercellular spaces (bottom; scale bar 1 µm). Black arrows mark
empty intercellular spaces. Pink arrows mark filled intercellular spaces. ep, epidermis;
oc, outer cortex; ic, inner cortex; en, endodermis; st, stele; is, intercellular space.
(Right) Quantification of images. Data are mean ± SEM. n = 5 hypocotyls. ****P <
0.0001 in an unpaired t test. (C) 3D representations of x-ray microtomography images
of WT and abcg5 hypocotyls. Black, background; white, air. Green lines show the
analyzed volume. The numbers in green are dimensions in micrometers.


Air channels in different organs from several
species limit light transmission and contribute
to phototropism


To determine the role of these air channels
more broadly, we water infiltrated Arabidopsis
hypocotyls, leaves (petioles and blades), Brassica
rapa hypocotyls, and Brachypodium distachyon
coleoptiles. Consistent with earlier reports that
used hypocotyls from several species (36), this
treatment enhanced light transmission in all
cases. This includes coleoptile tips, which are
the site of light perception for phototropism in
grasses and were previously shown to contain
intercellular air spaces (1, 37) (fig. S13). In
leaves, water infiltration had a stronger effect
than pigment clearing on light transmission,
including in the blue range of the spectrum
(fig. S13, F and I). This shows that, even in pig-
mented tissues, intercellular air contributes to
limiting transmission of light through the tissue
(32). To test whether air channels are neces-
sary for a phototropic response, we compared
growth, gravitropism, and phototropism in
mock and water infiltrated Arabidopsis and
the more than 10 times larger Brassica seed-
lings and found that in both species, water
infiltration reduced phototropism but not
gravitropism, which could not be explained by
growth inhibition (Fig. 3 and fig. S14). Taken
together with our analysis of the abcg5 mutant,
our data indicate a broad requirement for in-
tercellular air channels in seedling phototropism.


Air channels contribute to the formation of 
directional light cues 


To better characterize the optical properties of 
etiolated seedlings, we used an integrating 
sphere that allowed measurements of total 
transmittance and reflectance along with light 
scattering, i.e., diffused transmittance and re- 
flectance. As observed with our light micros- 
copy measurements (Fig. 1), abcg5 seedlings 
transmitted more light than the WT (Fig. 4A). 
Moreover, we filled the air spaces in WT seed- 
lings by water infiltration and found that in- 
filtrated WT samples showed optical properties 
similar to those of the abcg5 mutant. By con- 
trast, infiltration of abcg5 samples did not lead 
to any significant changes in optical properties 
(Fig. 4A). Our data showed that abcg5 mutants 
and infiltrated seedlings showed reduced dif- 
fused transmitted light, reflected light, and dif- 
fused reflected light (Fig. 4A). These data are 
consistent with air channels that enhance light 
scattering in plant tissues owing to the differ- 
ence in the refractive index of air compared 
with water, cellular fluid, and cell walls (36). 
To understand how changes in optical proper- 
ties affect the light microenvironment within 
the hypocotyl, we used microscopy. We recon- 
structed hypocotyl transverse cuts using Z 
stacks of transmitted light images. These images 
showed a similar pattern in all samples, but the 
contrast was higher in the WT compared with 
the abcg5, cristal mutants, and infiltrated sam-
ples, indicating stronger light scattering in the
WT (Fig. 4B and fig. S15A). Moreover, by com- 
bining these images with images of membrane- 
associated fluorescent proteins, we determined 
that the areas of strong scattering coincided 
with the intercellular spaces at tricellular 
junctions (white arrows in Fig. 4C and fig. 
S15, B and C). We conclude that intercellular 
air channels contribute to light scattering, 
thereby limiting light transmittance across 
the hypocotyl. 
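
One way to see why an air-filled space scatters more light than a liquid-filled one is the textbook Fresnel reflectance at normal incidence, R = ((n1 - n2) / (n1 + n2))^2. The sketch below applies it to the refractive indices quoted in the Discussion (air 1, cell wall 1.42, cellular fluid 1.33). It is an illustrative simplification that assumes a flat interface and normal incidence; it is not the authors' optical analysis, and the function name is hypothetical.

```python
def fresnel_reflectance_normal(n1, n2):
    """Fraction of light reflected at a flat n1/n2 interface at normal incidence."""
    return ((n1 - n2) / (n1 + n2)) ** 2


# Refractive indices as quoted in the Discussion of the article.
RI_AIR = 1.0
RI_CELL_WALL = 1.42
RI_CELL_FLUID = 1.33

wall_air = fresnel_reflectance_normal(RI_CELL_WALL, RI_AIR)
wall_fluid = fresnel_reflectance_normal(RI_CELL_WALL, RI_CELL_FLUID)

print(f"cell wall / air:   {100 * wall_air:.2f}% reflected")
print(f"cell wall / fluid: {100 * wall_fluid:.2f}% reflected")
print(f"ratio: {wall_air / wall_fluid:.0f}x")
```

Under these assumptions a single cell wall-air boundary reflects roughly 3% of the incident light, against about 0.1% for a cell wall-fluid boundary, which is in line with the loss of scattering seen when the spaces are filled by water infiltration or in abcg5.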

Nawkar et al., Science 382, 935-940 (2023), 24 November 2023

Fig. 3. Air channels are required for phototropism in Arabidopsis and Brassica seedlings. (A) Picture
of representative 3-day-old etiolated seedlings. Scale bar, 1 cm. (B and C) Phototropism in 3-day-old etiolated
seedlings of (B) Arabidopsis and (C) Brassica. (D and E) Gravitropism in 3-day-old etiolated seedlings of (D)
Arabidopsis and (E) Brassica. 3-day-old etiolated seedlings were infiltrated with water and 0.01% Silwet, applying
vacuum. Infiltrated and control seedlings were immediately placed in vertical plates and irradiated with 0.025 µmol
m⁻² s⁻¹ unilateral blue light, or plates were rotated 90° at time 0 hours. See growth phenotype in fig. S12. Data
are mean ± 95% CI of 18 Arabidopsis seedlings or 24 Brassica seedlings. Asterisks represent significant differences
between control and infiltrated samples in a two-way ANOVA followed by Sidak’s multiple comparisons test
(*P < 0.05). Experiments were repeated at least three times with similar results.

To visualize the light gradient across an
etiolated hypocotyl, we used a pPHOT1:PHOT1-
GFP line either in phot1phot2 (38) (WT ABCG5)
or in a phot1phot2abcg5 (abcg5) background.
We obtained Z stacks across the entire width
of the hypocotyl using a confocal microscope 
equipped with a blue-light laser. As observed 
previously (9, 38), the GFP signal was strongest 
in cortex cells owing to the pPHOT1 expression
pattern (Fig. 4, C and D). In the WT background,
we noticed GFP signal gaps along the plasma
membrane that correspond to the position of in-
tercellular air spaces formed between the epider-
mis and cortex or cortex-cortex junctions (Fig. 4C,
fig. S15B). These fluorescence gaps were not
observed in abcg5 (fig. S15B). Similar gaps were
observed when we used a line expressing 35S
promoter-driven plasma membrane-associated
GFP (35S:myri-GFP) in the WT, indicating that
the effect of air channels on fluorescence visual-
ization is not specific to PHOT1 and allowing
visualization of the effect of air channels in the 
epidermis (fig. S15C). To determine the effect 
of light scattering on the light gradient across 
the hypocotyl, we compared the PHOT1-GFP 
fluorescent signal on the lit versus shaded sides 
of the hypocotyl (Fig. 4D). Our quantifications 
showed that the light gradient was significantly
steeper in the WT than in the abcg5 mutant
or infiltrated seedlings (Fig. 4D and fig. S15D). 
Taken together, our results show that air chan- 
nels present in the WT enhance light scatter- 
ing, thereby leading to a stronger light gradient 
across a unilaterally illuminated hypocotyl. Char-
acterization of the abcg5 mutant and water-
infiltrated Arabidopsis or Brassica hypocotyls
shows that air channels are functionally rele- 
vant for the response to directional light cues. 
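
The shaded/lit comparison described in the Fig. 4 legend (mean GFP fluorescence of each half of the transverse section, expressed relative to the mean of the whole section) can be sketched as follows. This is a minimal illustration on a made-up intensity grid; the function names and the simple left/right split are assumptions, and the authors' segmentation of the hypocotyl is not reproduced here.

```python
def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def shaded_lit_ratio(image, light_from="left"):
    """Return (relative lit mean, relative shaded mean, shaded/lit ratio).

    `image` is a 2D list of GFP intensities for one transverse (XZ) section.
    The section is split into two halves by column; with an odd number of
    columns the middle column is left out of both halves.
    """
    n_cols = len(image[0])
    half = n_cols // 2
    left = [px for row in image for px in row[:half]]
    right = [px for row in image for px in row[n_cols - half:]]
    whole = mean(px for row in image for px in row)

    lit, shaded = (left, right) if light_from == "left" else (right, left)
    rel_lit = mean(lit) / whole
    rel_shaded = mean(shaded) / whole
    return rel_lit, rel_shaded, rel_shaded / rel_lit


# Made-up intensities that fall off away from a light source on the left.
section = [[8, 6, 4, 2] for _ in range(3)]
rel_lit, rel_shaded, ratio = shaded_lit_ratio(section, light_from="left")
print(f"lit {rel_lit:.2f}, shaded {rel_shaded:.2f}, shaded/lit {ratio:.2f}")
```

A lower shaded/lit ratio corresponds to a steeper gradient across the section, which is the sense in which the WT gradient is described as steeper than that of abcg5 or infiltrated seedlings.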


Fig. 4. Air channels create a directional light signal in Arabidopsis hypoco-
tyls. (A) Optical properties of seedlings. Data are mean ± SEM of five
(transmittance) or eight (reflectance) samples from two experiments. Treat-
ments were compared by calculating the area under the curve in the blue range
of the spectrum (400 to 500 nm). ANOVA followed by Tukey's multiple
comparisons test showed significant differences between WT and all other
treatments (*P < 0.001). (B, C, and D) XZ orthogonal views of bright field and/
or fluorescence confocal images of etiolated hypocotyls. Blue arrows indicate
the direction of the incident light. (B) (Left) Representative images in grayscale
(upper row) or made by using an artificial look-up table (lower row). Scale
bar, 50 µm. (Right) Quantification of images. Data are mean ± SEM of five
(noninfiltrated) or six (infiltrated) samples. *P < 0.00[...]
Sidak's multiple comparisons test versus WT noninfi[...] (C) [...]
hypocotyls expressing pPHOT1:PHOT1-GFP (green) in an ABCG5 WT background.
(Top) Bright field; (bottom) fluorescence. White arrows indicate the intercellular
spaces between epidermal and cortex cells. Scale bar, 10 µm. (D) (Left)
Fluorescence images of etiolated hypocotyls expressing pPHOT1:PHOT1-GFP (green)
in an ABCG5 WT or mutant (abcg5) background. Scale bar, 50 µm. (Right)
Quantification of images. For each image, the mean GFP fluorescence of the
whole transverse cut, as well as that of the half closer to the light source (lit) and
farther from it (shaded), were quantified. Plotted values are mean GFP fluorescence
of each side relative to the mean GFP fluorescence of the whole image. The shaded/
lit ratio is calculated from these relative values. ANOVA showed that there is an
interaction effect of genotype x side, and Sidak's multiple comparisons showed
that there is a significant difference in fluorescence between the lit and shaded sides
in both WT and mutant genotypes (*P < 0.05). n = 14.


Discussion


The tissue properties responsible for light-gradient
establishment leading to phototropism have
been debated (7). Some experimental evidence
supports the importance of light-absorbing
pigments (31, 39). However, a role for light
diffusion by refraction and reflection when light
traverses a tissue with different refractive in-
dices (RI; air RI = 1, cell wall RI = 1.42, cellular
fluid RI = 1.33) has also been proposed (36).
Our data from the abcg5 mutant and water-
infiltrated hypocotyls, coleoptiles, and leaves
confirm the importance of intercellular air
channels in limiting light transmittance. More-
over, they enable us to show the functional im-
portance of air channels for phototropism in
Arabidopsis and Brassica seedlings. The remain-
ing phototropic response in the abcg5 mutant
can be explained by a shallower light gradient
across the hypocotyl due to the difference in
refractive indices between cellular fluids and
the cell wall. Indeed, a light gradient across
sunflower hypocotyls could be largely elim-
inated by oil infiltration, which led to a medium
with a homogeneous RI (36). Our work sug-
gests that ABCG5 is important to establish or
maintain air channels that were reported in the
hypocotyl of several species (33-35, 40). This is
consistent with ABCG5 expression in the cortex
of embryonic tissues and ABCG5 subcellular
localization (22, 25). Intercellular spaces are
formed at tricellular junctions by degradation
of the middle lamella (41, 42). Air channels in
intercellular spaces form early in development,
as they are present in dry Arabidopsis seeds
(34). abcg5 mutant seeds are denser than WT
seeds, and microscopic examination of abcg5 
embryos shows that the intercellular spaces are 
present but filled with electron-dense material. 
In hypocotyls of etiolated seedlings, we show 
that instead of air, as in the WT, abcg5 inter-
cellular spaces are filled with a fibrillar net-
work that appears to be detaching from the 
cell wall. An electron-dense layer surrounding 
air channels in intercellular spaces was de-
scribed in numerous species (41-43). This layer
is present in WT Arabidopsis etiolated seedlings 
but is reduced or absent in abcg5 mutants. The
composition of this structure remains largely
unknown. It was proposed that this layer com- 
prises a lipophilic film analogous to suberin or 
cutin (40). Moreover, pectins are enriched in 
the cell walls lining intercellular spaces, and a 
role for lignin was also proposed to maintain 
air in the channels (41, 44). Air spaces in plants 
are widely required, including for photosyn- 
thetic capacity in leaves or flooding tolerance 
in roots, but the mechanisms underlying their 
ontogeny remain poorly understood (42). Our 
work identifies an element in the development 
of air channels and shows that they shape the 
optical properties of tissues, providing direc- 
tional light information to plants. This may rep- 
resent an example of exaptation, as a tissue 
feature required for gas exchange in the embryo 
and seedling (33-35) also provides hypocotyls 
with the ability to respond to light direction.


POLLUTION


Mortality risk from United States coal electricity generation


Lucas Henneman*, Christine Choirat, Irene Dedoussi, Francesca Dominici, Jessica Roberts, Corwin Zigler


Policy-makers seeking to limit the impact of coal electricity-generating units (EGUs, also known as power 
plants) on air quality and climate justify regulations by quantifying the health burden attributable to 
exposure from these sources. We defined “coal PM2.5” as fine particulate matter associated with coal EGU
sulfur dioxide emissions and estimated annual exposure to coal PM2.5 from 480 EGUs in the US. We estimated
the number of deaths attributable to coal PM2.5 from 1999 to 2020 using individual-level Medicare death
records representing 650 million person-years. Exposure to coal PM2.5 was associated with 2.1 times greater
mortality risk than exposure to PM2.5 from all sources. A total of 460,000 deaths were attributable to coal
PM2.5, representing 25% of all PM2.5-related Medicare deaths before 2009 and 7% after 2012. Here, we
quantify and visualize the contribution of individual EGUs to mortality. 


Air pollution exposure is associated with
adverse health effects and increased risk
of death (1-4). Coal electricity-generating
units (EGUs), or power plants, are a major 
contributor to poor air quality (5-7). Coal, 
historically a relatively inexpensive fuel, is 
burned to provide electricity worldwide even 
as the US and other nations continue to de- 
bate whether it should remain a part of the 
energy portfolio amid public health and cli- 
mate concerns. Global coal use for electricity 
generation is projected to increase (8), and 
ongoing instability has pushed European na- 
tions to increase coal use (9, 10). Although coal 
EGU air pollution emissions have declined in 
the US in recent decades (11), defining the
health burden posed by coal EGUs and the 
benefits of actions that have reduced EGU emis- 
sions remains paramount to informing public 
health, climate, and energy policies in the US 
(12) and worldwide. 
Previous studies that quantified the mor-
tality burden from coal EGUs in the US (13-18)
relied on estimated concentration response
functions (CRFs), which assume that fine par-
ticulate matter (PM2.5) from coal emissions has
the same toxicity as PM2.5 from all sources. How-
ever, evidence indicates (19-25) that exposure to
sulfur, sulfates, or PM2.5 from coal emissions
may be associated with higher relative morbid-
ity or mortality risk than that to other PM2.5 con-
stituents or PM2.5 from other sources per unit
concentration, although uncertainty remains 
(26, 27). The limited regional (19-22) and tem- 
poral (23-25) scope of previous studies, along 
with the lack of availability of coal-specific ex- 
posure estimates, has hindered the adoption 
of coal-specific PM2.5 CRFs in mortality burden
calculations, likely leading to underestimates 
of the mortality burden associated with coal 
EGUs. In addition, previous studies lack tar- 
geted evidence regarding which coal EGUs are 
most responsible for increased mortality risk, 
and this information is needed to inform policies. 

To estimate the number of deaths associa-
ted with exposure to coal PM2.5 from EGUs, we
conducted a national-scale study of individual-
level health records covering >650 million
person-years in the US Medicare population
(≥65 years of age) from 1999 to 2016 (unless
otherwise noted, populations throughout this
study refer specifically to the Medicare popu-
lation) (28). We defined “coal PM2.5” as PM2.5
from coal EGU SO2 emissions. We estimated
coal PM2.5 using the HYSPLIT with Average
Dispersion (HyADS) model, which accounts
for date-specific atmospheric transport of PM2.5
to characterize exposure to PM2.5 from individ-
ual EGUs (29-32). We used HyADS, a reduced
complexity model, to estimate 22 years of ex-
posure to coal PM2.5 (from 1999 to 2020) from
each of 480 US EGUs. These calculations would 
have required multiple orders of magnitude 
more computation time using a typical full- 
scale chemical transport model. 

Our study offers the following contribu-
tions. First, we estimated and compared mor-
tality risk associated with exposure to coal
PM2.5 versus total PM2.5 from all sources,
showing that previous analyses underestimated
the mortality burden from coal EGUs in the US.
Second, we calculated the number of deaths
linked to each of the 480 coal EGUs, ranking
each with respect to its contribution to the
mortality burden and tracking its contribu-
tion to the overall mortality burden over time
amid implementation of emissions controls
and retirements. Third, we documented the
spatial distribution of the mortality burden
across the US.

Fig. 1. ZIP code-level coal PM2.5 over time. Box plots (median, first, and third quartiles are shown as horizontal lines and outliers as dots) summarize the distribution
of ZIP code levels of coal PM2.5. Map areas shown in white do not have ZIP codes. Plots were produced in R using ggplot2; spatial information comes from the
USAboundaries package.


Results 
Changes in exposure to coal PM2.5 over time


By averaging ZIP (postal) code levels of coal 
PM2.5 across the conterminous US, we found
that the annual average coal PM2.5 declined
from 2.34 µg m⁻³ (range, 0.01 to 8.80) in 1999
to 0.07 µg m⁻³ (range, 0.00 to 0.39) in 2020
(Fig. 1). Coal PM2.5 was elevated in the eastern
US relative to the western US, with annual
average concentrations exceeding 4 µg m⁻³
in multiple ZIP codes in all years from 1999 to
2008. Coal PM2.5 exposure is a combination
of emissions from nearby and distant EGUs
(figs. S1 and S2).


Coal PM2.5 CRF


The Medicare dataset contains records of 
32.5 million deaths from 1999 to 2016 (table S1), 
with the annual number of deaths increasing and 
death rates decreasing across the study period 
(fig. S3). We found that a 1 µg m⁻³ increase in
annual average coal PM2.5 was associated with
a 1.12% increase in all-cause mortality [relative
risk (RR): 1.0125; 95% confidence interval (CI):
1.0113 to 1.0137]. This risk is ~2.1 times greater
than the RR associated with exposure to PM2.5
from any source (1.0060 per µg m⁻³; 95% CI:
1.0053 to 1.0067), which was estimated by
Wu et al. in 2020 in the same Medicare cohort
using an analogous statistical model (4). 
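
The "~2.1 times greater" comparison can be reproduced from the two point estimates quoted above by comparing the excess risk (RR - 1) per 1 µg m⁻³. The snippet below is a simple arithmetic check using only those two RRs; the helper name is illustrative.

```python
RR_COAL_PM25 = 1.0125   # per 1 ug m^-3 of coal PM2.5, as quoted in the text
RR_TOTAL_PM25 = 1.0060  # per 1 ug m^-3 of total PM2.5 (Wu et al., ref. 4)


def excess_risk(rr):
    """Excess relative risk per unit exposure, i.e. RR - 1."""
    return rr - 1.0


risk_ratio = excess_risk(RR_COAL_PM25) / excess_risk(RR_TOTAL_PM25)
print(f"coal PM2.5 excess risk is {risk_ratio:.2f} times that of total PM2.5")
```

The result, a little under 2.1, matches the rounded figure reported in the text.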


Number of excess deaths attributable to 
coal PM2.5


For each year from 1999 to 2020, we estimated 
the excess number of deaths attributable to coal 
PM2.5 relative to what would have occurred
assuming zero SO2 emissions from coal EGUs
(i.e., coal PM2.5 = 0). Summing over the study
period, we estimated that 460,000 (95% CI:
420,000 to 500,000) deaths would have been
avoided if all coal EGU SO2 emissions were
eliminated (Fig. 2 and table S2). Annual ex-
cess deaths attributable to coal PM2.5 were
highest between 1999 and 2007, averaging 
more than 43,000 deaths per year for a total 
of 390,000 (95% CI: 360,000 to 430,000). After 
2007, annual excess deaths declined substan- 
tially, reaching 1600 (95% CI: 1400 to 1700) in 
2020. The total number of deaths in the Medi- 
care population for the period 1999 to 2020 
was 38.6 million (we projected annual deaths 
in each ZIP code for the period 2017 to 2020 as 
the average from 2014 to 2016; fig. S3). There-
fore, Medicare deaths associated with coal
PM2.5 exposure represent 1.2% (95% CI: 1.1 to
1.3%) of all Medicare deaths. Changes in base-
line mortality rates had a much smaller influence
than changes in coal PM2.5 on the variability in
annual deaths associated with coal PM2.5 since
1999 (figs. S4 and S5).

Fig. 2. Annual number of excess deaths attributable to coal PM2.5, estimated using the RR for coal
PM2.5 from this study and RRs for total PM2.5 from the literature. All excess deaths are estimated
relative to zero coal PM2.5. The area filled by horizontal hashing indicates deaths estimated using RRs derived
from this study (bounds represent 95% CI). Areas filled by vertical and diagonal hashing correspond to
deaths estimated using RRs for total ambient PM2.5 exposure from the literature (4, 33). The gray shaded
region from 2017 to 2020 represents years for which ZIP code-specific baseline death rates were assumed 
from the 2014 to 2016 average. This figure was produced in R using ggplot2. 


The estimated RR for coal PM2.5 from the statistical model was higher than the
previously estimated RRs for total PM2.5 exposure that are often used for risk
assessments, implying that the number of excess deaths attributable to coal
EGUs was underestimated in prior studies (13-18). For example, by combining
coal PM2.5 exposure with two RRs for total PM2.5 previously used in risk
assessments, 1.0060 per 1 μg m⁻³ (95% CI: 1.0053 to 1.0067) estimated for the
Medicare population (4) and 1.6% per 10 μg m⁻³ (95% CI: 1.4 to 1.8) estimated
for the general population (33), we estimated 240,000 (95% CI: 220,000 to
260,000) and 200,000 (95% CI: 130,000 to 280,000) excess deaths from coal
EGUs, respectively (Fig. 2).

We compared mortality from coal PM2.5 estimated in the main analysis with an
aggregate health burden associated with total PM2.5 from all sources. Using
the RR reported by Wu et al. (4) for the Medicare population and the same
annual PM2.5 exposure used in that analysis, we calculated 2,000,000 excess
deaths due to ambient PM2.5 from 2000 to 2016 relative to a PM2.5
concentration of 0 (a portion of these excess deaths is attributable to
natural emissions sources). Thus, our estimates imply that exposure to coal
PM2.5 was associated with 25% of all PM2.5-related Medicare deaths from 2000
to 2008 and with 7% of all PM2.5 deaths from 2013 to 2016 (fig. S6).


Individual EGU contributions to mortality burden 


We identified 138 of the 480 coal EGUs that 
were associated with >1000 excess deaths across 
the study period and 10 EGUs that were asso- 
ciated with >5000 deaths (Fig. 3). Although 
EGUs east of the Mississippi River were associated with the greatest numbers
of deaths because of their high emissions and proximity to population
centers, each geographical region contained at least one EGU associated with
>400 deaths. The distribution of EGU-specific deaths was heavily skewed; 91%
of the total deaths were associated with EGUs that accounted for 50% of
nationwide coal EGU SO2 emissions during the study period. Normalizing excess
deaths by energy produced may rank EGUs differently.

Figure 4 shows the temporal trend in the 
number of deaths associated with each EGU, 
highlighting the two most harmful coal EGUs 
within each region. Large declines in the number of deaths corresponded with
SO2 emission control installations and facility retirements.
For example, for the Keystone facility in 
Pennsylvania, the average annual number of 
attributable deaths was 640 (95% CI: 580 to 
700) before 2008, but declined to 80 (95% CI: 
70 to 90) after scrubber installations in 2009 
to 2010. We developed an interactive tool to 
examine individual EGUs and their contri- 
butions to state-specific Medicare deaths in 
relation to SO2 emissions control installations
and unit retirements (34). 


Sensitivity of results to unmeasured 
confounding 


The stratified Poisson regression for estimat- 
ing the CRF was chosen based on its use in 
previous health impact studies of exposure to 
total PM2.5 in the Medicare population. The
log-linear CRF implied by the model was chosen 
to facilitate the source-specific attribution of 
health impacts, but it may not reflect the true 


[Fig. 3 graphic. Panels by region: Midwest east, Mid-Atlantic, Southeast, Southcentral, Midwest west, West, and Northeast. Regional totals legible in the graphic: Midwest east, 154,800 (140,400-169,200); Mid-Atlantic, 117,400 (106,400-128,300); Southeast, 94,500 (85,800-103,300). Legend: 1999-2004, 2005-2010, 2011-2016, 2017-2020. Axis: Excess Deaths. Inset axis: SO2 emissions (million kg).]

Fig. 3. Excess deaths associated with individual coal EGUs from 1999 to 2020. EGUs (N = 480) are organized by region to improve interpretability, and the facilities 
associated with the most deaths are labeled. Inset: total SO2 emissions by location from 1999 to 2020 (hexagonal grids may include multiple EGUs) and
regional boundaries. Plots were produced in R using ggplot2; spatial information comes from the USAboundaries package. 


relationship between coal PM2.5 and mortality risk across all exposure levels
during the study period. Although the stratified Poisson model adjusts for
many confounders and has been shown in the context of total PM2.5 exposure to
be robust to a variety of strategies for confounding adjustment, we cannot
rule out the possibility that unmeasured factors related to mortality risk
vary systematically with coal PM2.5 in a manner not captured by observed
characteristics in the model. Using the E-value (34, 35), we found that a
potential confounder would need to have an association with both mortality
rate and coal PM2.5 of 1.125 (lower confidence limit: 1.118) on the RR scale
to explain away the association between mortality and coal PM2.5.
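
The E-values quoted above can be reproduced from the main-analysis RR and its lower confidence limit. The sketch below assumes the standard E-value formula for a risk ratio, E = RR + sqrt(RR × (RR − 1)); the function name is illustrative.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio: the minimum strength of association (on the RR
    scale) that an unmeasured confounder would need with both exposure and
    outcome to fully explain away the observed RR."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective estimates are inverted first
    return rr + math.sqrt(rr * (rr - 1.0))


point = e_value(1.0125)        # main-analysis RR for coal PM2.5
lower_limit = e_value(1.0113)  # lower 95% confidence limit of that RR

print(f"E-value (point estimate): {point:.3f}")
print(f"E-value (lower limit):    {lower_limit:.3f}")
```

Rounded to three decimals, the two results agree with the 1.125 and 1.118 reported in the text.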

To explore the potential confounding by air pollution sources other than coal
PM2.5, we performed several additional sensitivity analyses. We present coal
PM2.5 RRs from models that adjust for total PM2.5, residual PM2.5 (total PM2.5
minus coal PM2.5) as a marker for all other sources, NO2 as a marker for
primary traffic-related air pollution, and both NO2 and residual PM2.5 (table
S3). Adjusting for total PM2.5 attenuated the risk of coal PM2.5
substantially, which is consistent with coal PM2.5 being captured by the total
PM2.5 metric. When including residual PM2.5 and/or NO2, we found a slight
attenuation in RR from the main model, and the RR for coal PM2.5 remained
higher than the RR for total PM2.5 found by Wu et al. (4). Including markers
for other PM2.5 sources as confounders introduced important limitations, as
explained in the supplementary text. Furthermore, we implemented a
“first-differences” analysis of within-ZIP code changes over time to adjust
for observed and unobserved differences across ZIP codes (fig. S7). This
analysis addresses possible threats to validity caused by confounding
differences across different locations, providing strong evidence that areas
experiencing larger decreases in coal PM2.5 also experienced larger decreases
in mortality rates. Results from this analysis support the validity of the
primary analysis to quantify the mortality burden with a relative risk
adjusted for individual- and ZIP code-level confounders measured throughout
the entire study period.
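
The idea behind a first-differences analysis can be illustrated with a toy example. The sketch below is not the authors' implementation (their specification is in fig. S7 and the supplementary text): it assumes a simple linear regression of year-to-year changes in mortality rate on year-to-year changes in coal PM2.5, and the panel values and function names are hypothetical. Differencing within each ZIP code removes any characteristic of that ZIP code that is constant over time, whether observed or not.

```python
from statistics import mean

# Hypothetical panel: ZIP code -> [(year, coal PM2.5 in ug/m3, deaths per 1,000)].
panel = {
    "A": [(1999, 3.0, 52.0), (2000, 2.5, 51.4), (2001, 1.5, 50.1)],
    "B": [(1999, 1.0, 45.0), (2000, 0.9, 44.9), (2001, 0.7, 44.6)],
}


def first_differences(series):
    """Year-over-year changes (delta exposure, delta rate) within one ZIP code."""
    ordered = sorted(series)  # sort by year
    return [(b[1] - a[1], b[2] - a[2]) for a, b in zip(ordered, ordered[1:])]


def ols_slope(pairs):
    """Slope of an ordinary least-squares fit of y on x, with an intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


diffs = [d for series in panel.values() for d in first_differences(series)]
slope = ols_slope(diffs)
print(f"change in deaths per 1,000 per 1 ug/m3 change in coal PM2.5: {slope:.2f}")
```

A positive slope in this toy setting means that larger decreases in exposure coincide with larger decreases in mortality rate, which is the pattern the paragraph describes.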


Sensitivity of results to HyADS characterization
of coal PM2.5


HyADS rescales air parcel location counts extracted from HYSPLIT to coal PM2.5
using a single year’s chemical transport model output, which may introduce
errors. Our comparisons (30) of coal PM2.5 with coal PM2.5 source impacts from
year 2006 Hybrid CMAQ-DDM, a full form model (FFM) bias corrected against
observations, confirmed that the spatial pattern is well captured and that
error and bias are within the typical range of FFMs (although the
bias-corrected CMAQ-DDM itself has uncertainty). Because we expected potential
errors in coal PM2.5 to be smaller in years surrounding the year when the
scaling was performed, we retrained the Poisson regression model three times
using data only from subsets of the total years available (1999 to 2003, 2004
to 2007, and 2008 to 2016). The estimated RRs from coal PM2.5 were comparable
but slightly larger than in the main analysis from the 1999 to 2003 and 2004
to 2007 models, with a more pronounced difference in RR from the 2008 to 2016
model (table S3). The change in RR across

Fig. 4. Total annual excess deaths associated with each of the coal EGUs in each region, with the two most harmful facilities highlighted. Scrubber
installations designate the earliest year that a scrubber was installed at one or more of each EGU’s units [facility information from (46)]. Plots were produced
in R using ggplot2.


different periods may be consistent with either 
genuine changes in risk or deterioration of 
HyADS’ performance in years further from the 
year on which the scaling was based (2005). 

To explore sensitivity to the process for scaling HyADS to coal PM2.5, we
estimated the RR and corresponding excess deaths from coal EGUs using unscaled
air parcel counts for each ZIP code output by HyADS and found similar
estimates of attributable deaths (650,000; 95% CI: 590,000 to 710,000).
Comparing coal PM2.5 estimates against observed sulfate PM2.5 at rural
monitors indicated that HyADS may have underestimated exposure and exaggerated
exposure declines during the study period (a portion of this decline in coal
PM2.5 is attributable to decreasing EGU contributions to total US SO2
emissions). A sensitivity analysis using a sulfate-adjusted coal PM2.5 metric
(fig. S8) estimated a mortality RR of 1.0147 (95% CI: 1.0135 to 1.0158) and
790,000 (95% CI: 720,000 to 850,000) excess deaths. These findings indicate
that, to the extent that HyADS might underestimate coal-derived PM2.5 and
exaggerate exposure declines, it provides a conservative estimate of the
mortality burden associated with exposure to SO2 emissions from coal EGUs.
Future studies may use newly developed approaches for estimating CRFs that
account for uncertainty in air pollution exposure (36).

We used SO2 emissions from coal to derive coal PM2.5 because of evidence that
secondary PM2.5 from SO2 emissions constitutes most of the ambient PM2.5 from
coal EGUs during the study period (13, 17, 37). Because SO2 emissions and
related atmospheric physical-chemical processes that increase ambient PM2.5
are correlated with complementary processes of other species, e.g., primary
PM2.5 and NOx, coal PM2.5 captures the influence of these other species.
Although primary PM2.5 emissions are not measured at each EGU, estimated
nationwide annual primary PM2.5 EGU emissions are correlated (R² = 0.97) with
measured nationwide annual SO2 emissions (38). Sensitivity analyses using
observed sulfate and comparisons with alternative modeling strategies revealed
broad consistency with the primary analysis, particularly in EGU relative
rankings by excess deaths, indicating bounds on uncertainties associated with
the diversity of technologies and assumptions available for assessing exposure
to EGU SO2 emissions.


Comparison with deaths estimated using a 
chemistry-transport air quality model 


Although it is impossible to directly validate the estimated number of excess
deaths attributable to coal PM2.5, we compared our results with analogous coal
EGU health burdens derived using atmospheric sensitivities from an FFM. Using
GEOS-Chem adjoint PM2.5 sensitivities (13) and the coal PM2.5 RR from the main
analysis, we estimated 20,000 (95% CI: 19,000 to 22,000) and 13,000 (95% CI:
12,000 to 14,000) excess deaths in 2006 and 2011, respectively (these years
were chosen to span emissions reductions after 2006 and to align with
previously published GEOS-Chem adjoint results). These values are comparable
to, although smaller than (especially in 2006), the excess deaths estimated
from coal PM2.5 exposure in this study of 35,000 (95% CI: 32,000 to 38,000)
and 15,000 (95% CI: 13,000 to 16,000) in 2006 and 2011, respectively.
Correlations between the number of deaths assigned to each coal EGU by HyADS
and GEOS-Chem adjoint were high (R² > 0.85) for all EGUs and for EGUs in most
regions (fig. S9 and table S4), and the two models rank ordered EGUs similarly
by their associated deaths (fig. S10). Mean differences in nationwide HyADS
EGU-specific death estimates relative to the chemical transport model were
higher in 2006 (71% for all EGUs) than in 2011 (15%). Although GEOS-Chem
adjoint results from the 2 years available are difficult to project to all 22
years of this study, and although chemical transport models, including
GEOS-Chem, have uncertainties due to potential bias in emissions inputs, model
parameterizations, and meteorology, agreement between the models at levels
consistent with previous studies (39) increases confidence in the results
reported here.
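
For orientation, the relative difference between the two sets of national totals quoted above can be computed directly. This is only a check on the totals: the 71% and 15% figures in the text are means of EGU-specific differences, so they need not equal the difference of the national sums. The dictionary and function names are illustrative.

```python
# National excess-death totals quoted in the text (point estimates).
hyads_total = {2006: 35_000, 2011: 15_000}       # this study (HyADS exposure)
geos_chem_total = {2006: 20_000, 2011: 13_000}   # GEOS-Chem adjoint sensitivities


def relative_difference(estimate: float, reference: float) -> float:
    """(estimate - reference) / reference, as a fraction."""
    return (estimate - reference) / reference


rel_diff = {
    year: relative_difference(hyads_total[year], geos_chem_total[year])
    for year in hyads_total
}

for year, value in sorted(rel_diff.items()):
    print(f"{year}: HyADS total is {value:.0%} above the GEOS-Chem adjoint total")
```

The gap between the totals is much larger in 2006 than in 2011, in line with the direction of the EGU-level comparison reported in the paragraph.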


Implications 


We conducted the longest-term national study 
to date estimating the excess number of deaths 
associated with exposure to SO2 emissions from
US coal EGUs. A key innovation in this study is 
the combined use of coal EGU-specific expo- 
sure estimates and individual-level health data 
on the same population during the same time 
period to estimate the mortality burden. This 
approach has been hampered until now by the 
limited availability of large-scale health data- 
bases and source-specific exposure estimates. 
Our approach illustrates the utility of deriving air pollution exposure with a
combination of dispersion-based and chemical transport models in
epidemiological studies and risk assessments for well-characterized sources.

We found that, over the past two decades in the US, coal PM2.5 was associated
with 460,000 extra deaths, constituting >22% of total excess deaths
attributable to PM2.5. We also found that the mortality burden of coal PM2.5
has been underestimated using traditional impact assessments that rely on CRFs
for total PM2.5 mass (13, 16-18, 39-41). The elevated mortality RR associated
with annual exposure to coal PM2.5 aligns with previous evidence of increased
relative health risks associated with coal-related PM2.5 or sulfur or sulfate
exposure per unit concentration (19-25), although other studies have found
little evidence of increased risk related to secondary sulfate PM2.5 or PM2.5
associated with coal (26, 27). Large decreases in annual deaths across the
study period highlight the success of emissions reductions brought about by
regulations under the 1990 Clean Air Act Amendments. Although coal use in the
US has remained low, global use is expected to increase and plateau by 2025
(8), suggesting the potential for high mortality costs from coal for years to
come.

We used SO2 emissions from coal to derive coal PM2.5; however, we cannot
conclude that the portion of ambient PM2.5 associated with SO2 emitted from
coal power plants is more or less harmful than ambient PM2.5 from other
species emitted from coal power plants. Disentangling the mortality risks of
the various PM2.5 species emitted from coal EGUs is not possible within our
modeling framework because of the high correlation between species emitted
from coal EGUs such as NOx and primary PM2.5. Given how we estimate exposure
to “coal PM2.5,” our finding of a higher mortality risk of exposure to coal
PM2.5 relative to other PM2.5 suggests the potential for population health
benefits of reducing SO2 emissions from coal power plants, for example, by
installing emissions control devices or shutting coal facilities completely.
Full separation of the health impacts of various emitted species from coal
EGUs is of additional interest to policy-makers because of the varying
technologies available to reduce EGU emissions of specific pollutants, and it
should be considered in future studies.

HyADS benefits from well-characterized source locations and emissions, along
with the relatively slow atmospheric transformation of emitted SO2 to
particulate sulfate. Expanded incorporation of information from observation
and chemical transport model-based source apportionment techniques in reduced
complexity models may enable linkages between emitted species beyond SO2,
atmospheric processes, exposure, and health outcomes. Although source-specific
PM2.5 cannot be directly measured, observation-based receptor methods for
PM2.5 source apportionment (42) could provide an approximate ground truth
(albeit with their own uncertainties) for evaluating modeled source-specific
exposure. Advanced sensitivity approaches incorporated within chemical
transport models, such as GEOS-Chem Adjoint used here, and sensitivity methods
such as the decoupled direct method (DDM) (43) or the integrated source
apportionment method (ISAM) (44) offer model-based approaches that more
explicitly incorporate atmospheric chemistry and physics. Expanding
computational capacity will make comparisons with these types of models in
applications with many sources increasingly feasible.

These results advance the growing body of evidence showing varying toxicity of
PM2.5 originating from different sources. Although the US and other countries
continue to regulate total ambient PM2.5 concentrations, entities such as the
EPA Clean Air Scientific Advisory Committee have specifically cited a need for
research to assess health effects associated with changes in PM2.5 composition
and sources over time as an important consideration for future PM2.5 policy
assessments (45). Our findings have implications for current air pollution
risk assessments, which incorrectly assume equal toxicity for ambient PM2.5
from all sources and for all locations. The research platform that we used to
quantify exposure associated with individual coal EGUs, which accounts for
pollution transport and location relative to population centers, can support
more efficient regulatory efforts by producing targeted evidence of how
individual EGU sources contribute to the existing health burden.


Rhodium-catalyzed tunable amide homologation
through a hook-and-slide strategy


Rui Zhang, Tingting Yu, Guangbin Dong* 


Preparation of diverse homologs from lead compounds has been a common and important practice in 
medicinal chemistry. However, homologation of carboxylic acid derivatives, particularly amides, remains 
challenging. Here we report a hook-and-slide strategy for homologation of tertiary amides with tunable 
lengths of the inserted carbon chain. Alkylation at the α-position of the amide (hook) is followed by
highly selective branched-to-linear isomerization (slide) to effect amide migration to the end of the newly 
introduced alkyl chain; thus, the choice of alkylation reagent sets the homologation length. The key step 
involves a carbon-carbon bond activation process by a carbene-coordinated rhodium complex with 
assistance from a removable directing group. The approach is demonstrated for introduction of chains 
as long as 16 carbons and is applicable to derivatized carboxylic acids in complex bioactive molecules. 


The term “homologs,” initially introduced to chemistry by Gerhardt in 1843,
refers to a series of molecules exhibiting varying carbon-chain lengths but
otherwise identical functionality (1). Although these molecules typically
manifest similar chemical reactivities, the differences in chain length often
exert a profound influence on their physicochemical properties and molecular
conformation, consequently resulting in distinct lipophilicity, solubility,
pKa values (where Ka is the acid dissociation constant), and binding affinity
(2). For instance, the potency of a class I histone deacetylase (HDAC)
inhibitor increased >540-fold after elongation of the amide chain by three
carbons (3). Similarly, the two-carbon-extended homolog of a thromboxane A2
receptor antagonist was found to be 500 times more active than the
corresponding shorter one (Fig. 1A) (4). As such, synthesis of diverse
homologs derived from lead compounds has been a critical strategy in medicinal
chemistry research. Nevertheless, direct homologation at a common functional
group (FG), a process entailing insertion of one or multiple methylene units
into preexisting molecular frameworks, remains a nontrivial endeavor (5).

Homologation of reactive carbonyl compounds, such as aldehydes and ketones,
has been extensively investigated and used in the synthesis of complex
molecules (5, 6). For example, aldehydes are readily converted to their
one-carbon homologs through the Wittig reaction with a methoxy-derived ylide,
followed by hydrolysis of the resulting enol ether (7). Ketones are
homologated with diazo compounds through a nucleophilic addition and
1,2-migration pathway (8) (Fig. 1B); they can also undergo two-carbon
homologation by means of rhodium-catalyzed alkene insertion (9, 10). In
addition, monomethylene insertion to carboxylic acids or esters can be
achieved through the Arndt-Eistert reaction (through acyl chlorides) (11),
Barton’s decarboxylative homologation (through activated esters) (12), or
Kowalski’s ester homologation (with strong bases) (13). In contrast, despite
being a prevalent structural motif in medicinally important compounds, amides
have proven to be a difficult class of substrates for homologation. This is
primarily due to the inherent high stability of the amide carbonyls, which
limits the effectiveness of existing methods that typically rely on
nucleophilic carbonyl additions. In addition, most of the current homologation
techniques offer only one-carbon homologation, or two at most, necessitating
multiple iterations to insert methylene chains. Hence, an efficient and
tunable amide homologation approach, which would be valuable for preparing
diverse homologs, remains highly sought after.


Hook-and-slide strategy 


To achieve the desired amide homologation, we conceived of a “hook-and-slide”
strategy assisted by catalytic activation of the α-C-C bonds of the amide
(Fig. 1C). The approach starts with simple alkylation at the α-position of
amides using a common alkyl electrophile (e.g., halides or sulfonates) of a
specific carbon-chain length. Next, a transition metal catalyst selectively
activates and inserts into the α-C-C bond of the amide (or masked amide) by
means of oxidative addition (14). Subsequently, the amide (or masked amide)
moiety slides along the newly hooked alkyl chain all the way to the terminal
position through a series of β-hydrogen eliminations and metal-hydride
reinsertions (15), followed by C-C reductive elimination to give the desired
homologated product. With this hook-and-slide strategy, the homologation
length would be identical to the carbon-chain length of the alkyl
electrophile, resulting in straightforward tunability.

However, substantial challenges loomed. First, compared to C-C activation of
ketones (16-20), cleavage of unstrained amide α-C-C bonds by transition metals
is extremely rare because it is generally much easier to cleave the more
reactive C-N bond (21, 22). The sole reported example uses specially designed
substituents on amide nitrogen and only cleaves C(aryl)-amide bonds (23). In
addition, β-hydrogen elimination and metal-hydride reinsertion are generally
reversible elementary steps in the chain-walking process (15). Achieving full
selectivity for the terminal position remains challenging for analogous
isomerization reactions (24-26) and would require careful catalyst and
substrate design.

Inspired by the recent development of C-C activation in unstrained substrates
(17, 27, 28), we hypothesized that oxidative addition of amide α-C-C bonds
could be facilitated by a labile directing group (DG) conveniently installed
and then removed (20). Although 2-aminopyridine has been successfully used as
a temporary DG (29) for ketone substrates, its installation on normal amide
substrates poses a challenge. Unlike aldehydes and ketones that tend to
condense with primary amines to yield imines, tertiary amides preferentially
undergo transamidation (30). Indeed, when amide 1a was used as a model
substrate, common amide activating reagents (31), such as Tf2O and POCl3,
failed to give the desired condensation product in our hands (Fig. 2A), which
is probably due to the competing nucleophilicity of the pyridine moiety when
reacting with the electrophile. Inspired by the seminal work of Kakimoto and
co-workers (32), we found that polyphosphoric acid trimethylsilyl ester
(PPSE), prepared in situ from P2O5 and TMS2O, can successfully promote the
desired amidine formation between 2-amino-3-picoline and amide 1a in 51%
yield. Further optimization (see table S1) revealed a surprising effect of
pyridine (2 equiv) as an additive, which enhanced the yield to nearly
quantitative. Note that the amidine intermediate 2a is both air- and
moisture-stable and can be easily purified by column chromatography.

Having established the standard protocol for the DG installation, we next
examined the key C-C activation step to slide the amidine moiety to the
terminal position (Fig. 2B). When [Rh(C2H4)2Cl]2 was used as the catalyst in
toluene, several N-heterocyclic carbene (NHC) ligands were surveyed. Although
NHC ligands with an unsaturated core, such as IPr and IMes, appeared to be
ineffective, the saturated NHCs were found to be more reactive. For example,
SIMes can afford the desired linear product 3a in 73% yield. The SIDep ligand,
featuring 2,6-diethylphenyl substituents, further increased the yield to 82%.
The major side-product (10 to 15% yield) was identified as propylamidine 3a’,
which could be generated from hydroamidination of ethylene from the rhodium
precatalyst, likely owing to the high reactivity of ethylene. To suppress this
pathway, we replaced [Rh(C2H4)2Cl]2 with a structurally well-defined catalyst,
[Rh(coe)(SIDep)Cl]2·hexane, which contains a less coordinative cyclooctene
(coe) ligand; this further improved the yield to 96%. The precoordinated
catalyst showed higher reactivity compared with the in situ combination of
[Rh(coe)2Cl]2 and SIDep (Fig. 2B, entry 1). Using this new catalyst, C-C
activation of 2a could take place even at 40°C in 75% yield (entry 2), which
is in sharp contrast to the prior high temperature requirement in related
ketone C-C activation (20). Other solvents, such as PhCl and 2-MeTHF, were
slightly less effective than toluene (entries 3 and 4). In addition, when the
catalyst loading was reduced to 2.5 mol %, the desired product could still be
obtained in 82% yield (entry 5). Further reducing the catalyst loading led to
lower conversion (entry 6), although most of the starting material could be
recovered.

Finally, the DG removal was studied under various conditions (Fig. 2C; see
tables S4 and S5 for additional details). The treatment with HCl in water
under reflux resulted in full hydrolysis, and the corresponding carboxylic
acid was obtained in nearly quantitative yield. The basic hydrolysis condition
afforded the desired amide product, albeit with poor selectivity. Somewhat
surprisingly, simply heating amidine 3a in neutral isopropanol and water
delivered the desired amide product in 88% yield. The efficiency could be
further improved to almost quantitative yield when (Bu3Sn)2O was used as an
additive (33) under pH neutral conditions after 36 hours.


Scope of one-carbon homologation 


With the optimized reaction sequence in hand, the scope of the one-carbon
homologation of tertiary amides was explored first (Fig. 3). The substrates
can be easily accessed through α-methylation of the corresponding
arylacetamides. It appears that the electronic property of the aryl group does
not have a substantial

Fig. 3. Substrate scope of one-carbon homologation. Reaction conditions:
Stage I: 1 (1.5 mmol), DG-NH2 (1.65 mmol), PPSE (24 mmol), and pyridine
(3.0 mmol) at 140°C for 18 hours. Stage II: 2 (0.2 mmol), [Rh(coe)(SIDep)Cl]2·hexane
(0.01 mmol), and toluene (0.5 ml) at 60°C for 24 hours. Then, (Bu3Sn)2O
(0.2 mmol), isopropanol (1.0 ml), and H2O (1.0 ml) at 100°C for 36 hours. The
two yields are isolated yields for each stage. *Substrates 1a and 1aa to 1aj
were synthesized through direct amidation of the commercially available
carboxylic acids. Substrates 1k-1q, 1y, and 1z were synthesized through
late-stage derivatization or arylation of propanamide (see supplementary materials
for additional details). †Stage I was run at 1.0 mmol scale. ‡Stage I: 1 (1.5 mmol),
DG-NH2 (2.25 mmol), P2O5 (18 mmol), TMS2O (18 mmol), and pyridine
(4.5 mmol) at 160°C for 18 hours.


impact on the hook-and-slide process, as both electron-withdrawing and
electron-donating groups are compatible. Aryl bromide (4d) and thioether (4i)
groups, which could potentially react with transition metals, were also
tolerated. The DG condensation and isomerization steps showed excellent
selectivity for amides over other reactive FGs, including cyanides (4k),
esters (4l), sulfonamides (4m), bulkier pivalamides (4n), and even diaryl
ketones (4aa). Amides with other reactive moieties, such as olefins (4o) and
silanes (4p), were also suitable substrates, albeit with lower condensation
yields. Substrates with meta-substituents (4r, 4s) or ortho-substituents
(4, 4a1) still exhibited high reactivity. The reaction is not limited to
phenyl derivatives; other aromatic systems, including naphthalenes (4v, 4w),
thiophenes (4x), pyridines (4y), and quinolines (4z), also worked well.
Finally, different amide moieties other than pyrrolidine derivatives were also
tested. Bicyclic amines (4ac, 4ad, 4ae), piperidine (4af), morpholine (4ag),
and piperazine (4ah) all reacted smoothly to deliver the desired

Fig. 4. Substrate scope of multiple-carbon homologation. Reaction
conditions: Stage I: 6 (1.5 mmol), DG-NH2 (2.25 mmol), PPSE (36 mmol),
and pyridine (4.5 mmol) at 160°C for 18 hours. Stage II: 7 (0.2 mmol),
[Rh(SIDibp)(coe)Cl]2 (0.01 mmol), and chlorobenzene (0.5 ml) at 50°C for
24 hours. Then, (Bu3Sn)2O (0.2 mmol), isopropanol (1.0 ml), and H2O (1.0 ml)
at 100°C for 36 hours. The two yields are isolated yields for each stage.
*Substrate 6a was synthesized through direct amidation of the commercially


products, although the DG installation and/or 
hydrolysis became somewhat less effective for 
substrates with increased steric hindrance. 
Acyclic amine-derived substrates, such as di- 
methylamine (4ai) and N-methylbenzylamine 
(4aj), are also competent. Although the current 
substrate scope requires a-aryl-substituted 
amides at this preliminary stage (for a detailed 
discussion of challenging and unsuccessful sub- 
strates, see section 3 of the supplementary ma- 
terials), aryl acetic acid derivatives are widely 
available and prevalent intermediates in phar- 
maceutical synthesis (4, 34, 35). 


Homologation with longer methylene chains 


Multiple-carbon homologations were next ex- 
amined (Fig. 4). Although the o-alkylation with 
higher alkyl halides occurred smoothly, a num- 
ber of concerns are associated with the long- 
distance migration. First, the increased sterics 
around the amide moiety could hinder the 
DG condensation. Second, a 1,2-disubstituted 
alkene intermediate would be generated, which 
is known to exhibit lower binding affinity than 
the terminal alkene toward the metal catalyst 
(36). Third, as reductive elimination could in 
principle occur at any position along the alkyl 
chain, a complex mixture of isomers could pos- 
sibly be generated. After extensive optimization 
(see tables S2 and S3), the DG condensation 
still proceeded in high yield at an elevated re- 
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DG-NHbz (1.65 mmol), PPS 


and chlorobenzene (1.0 ml) 
isopropanol (2.0 ml), and H 


action temperature with higher PPSE and DG 
loadings. An SIDibp ligand, which replaces the 
ethyl groups in SIDep with zso-butyl groups, was 
discovered to be more efficient for long-distance 
migration. Under optimized conditions, 6a with 
an ethyl substituent selectively gave the linear 
two-carbon homologation product 8a in 60% 
yield. Substrates with different FGs also showed 
comparable reactivities (8b to 8j). The reac- 
tion could be extended to longer methylene 
units. The 3-, 4-, 5-, and 6-carbon homologations 
(8K to 8r) all gave the terminal amides as the 
only products without decrease in yield. En- 
couragingly, 10-carbon (8q) and even 16-carbon 
homologation (8r) could be realized, which 
shows the generality of the hook-and-slide 
strategy. Notably, excellent linear selectivity 
was observed in all examples without any ob- 
vious (<5%) branched intermediates. 

To showcase synthetic utility, the hook-and- 
slide strategy was first applied to preparing 
homologs directly from complex bioactive amides 
(Fig. 5A). Using the AMPA receptor (AMPAR)- 
positive allosteric modulator 9a (37, 38) as the 
common starting material, the corresponding 
one-, two-, three-, and four-carbon homologa- 
tions took place smoothly to give the respective 
homologs in good overall yield. The pyrazole 
moiety that potentially could have acted as a 
competing DG was well tolerated. Compared 


with the reported approach involving individ- 
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available carboxylic acid. Substrate 6f was synthesized through cyanation of 
6c (see supplementary materials for additional details). {Stage |: 6 (1.5 mmol), 


E (24 mmol), pyridine (3.0 mmol) at 140°C for 


18 hours. +Stage II: 7 (0.4 mmol), [Rh(C2H4)2Cl]2 (0.02 mmol), SIDep (0.04 mmol), 


at 80°C for 24 hours. Then, (BusSn)20 (0.4 mmol), 
20 (2.0 ml) at 100°C for 36 hours. §Stage II: 7.5 mol % 


[Rh(SIDibp)(coe)Cl}2 was used for 48 hours. 


ual synthesis for each analog (37), this method 
provides a simple unified approach that avoids 
laborious preparation of different carboxylic 
acid substrates. 

On the other hand, given that carboxylic acids 
can be directly generated using an acidic work- 
up after the C-C activation process, tunable 
homologation of carboxylic acids was next explored.
As shown in Fig. 5B, direct α-alkylation
of carboxylic acids is well established through
the double deprotonation protocol, and sequential
addition of pyrrolidine and DG-NH2
under the previous condensation conditions 
smoothly delivered the amidines from the car- 
boxylic acid substrates in one pot. No inter- 
mediate chromatographic purification was 
required; the mixture after workup was directly 
subjected to the key Rh-catalyzed migration 
step. A final acidic hydrolysis removed all amine 
moieties and gave the desired linear carbox- 
ylic acid products. This procedure worked 
well for a range of carboxylic acids and can 
be used to convert several drug molecules to 
their unbranched analogs (9c to 9e). Although 
several methods have been developed for car- 
boxylic acid homologation (11-13), their ap- 
plication in late-stage skeletal modifications 
is still difficult owing to the harsh conditions, 
toxic reagents, and/or tedious routes for multicarbon
homologations. Here, the hook-and-slide
strategy is a convenient alternative. For
example, homologs of isoxepac have been intriguing
anti-inflammatory pharmaceutical
targets (39), but the known synthetic route 
to prepare each isoxepac homolog required 
different phenol substrates (35, 39), some of 
which were not commercially available. Using 
our method, one-, two-, four-, and six-carbon ho- 
mologations were successfully attained starting 
with the commercially available isoxepac as the 
common substrate. Both the ketone carbonyl 
and benzyl ether FGs remained intact during 
this homologation process. Similarly, homo- 
logs of TUG-891 (40, 41), a synthetic G protein- 
coupled receptor 120 (GPR120) agonist for 
potential diabetic treatment, were efficiently 
prepared by this strategy in good overall yields. 


Mechanistic considerations 


The reaction mechanism was explored through 
a combination of experiments and computational
studies. As shown in Fig. 6A, both β- and
α-deuterated substrates gave products with
deuterium isotopes evenly distributed along
the aliphatic chain, suggesting the occurrence
of rapid and reversible β-hydrogen elimination
and reinsertion processes. The loss of
deuterium in these reactions is possibly due 
to proton exchange with residual water in the 
solvent. In addition, the crossover experiments 
with two structurally similar substrates yielded a 
mixture of two conserved products, together with 
two crossover products in comparable yields (Fig. 
6B). This result indicates that the coordination 
between the olefin intermediate and the Rh 
hydride is labile enough to allow intermolecular 
exchange, which also explains the formation 
of side-product 3a’ when [Rh(C2H4)2Cl]2 was
used as the catalyst (through the reaction with 
ethylene; see Fig. 2B). 

Density functional theory (DFT) calculations 
predict that the C-C oxidative addition step 
requires an activation barrier of 18.0 kcal/mol
with respect to a substrate-coordinated Rh(I)
complex (Int1), which is about 7 kcal/mol
higher than the corresponding barrier derived
from cyclopentanones (42), suggesting that the
C-C activation of α-branched amides is a relatively
slow step (Fig. 6C). Subsequent β-hydrogen
elimination leads to an 18-electron Rh hydride species
with a labile coordinated styrene ligand. The 
dissociation of styrene to Int4H is exothermic 
by 3.5 kcal/mol, and this intermediate could 
dimerize to give Int4H-dimer, further favored 
by 4.5 kcal/mol. These stable rhodium hydride 
species are consistent with our findings in the 
deuterium labeling and crossover experiments. 
The reinsertion step shows similar activation 
energy to the initial C-C activation step, fol- 
lowed by the reductive elimination to give the 
final product. According to the computational 
results, all the steps have relatively low bar- 
riers (see fig. S2 for additional details), and the 
relative energy difference between the linear
and branched amidines, which is the overall free
energy change of the reaction, is −4.0 kcal/mol,
suggesting that the linear product is thermo- 
dynamically more stable. This is the key to driv- 
ing the reaction to completion at 60°C. Other 
possible side reactions, such as benzylic C-H ac- 
tivation or C-N activation, were also considered. 
Although a directed benzylic C-H activation path- 
way seems possible with a slightly higher barrier, 
it generates a less stable intermediate and lacks 
favorable subsequent transformations. In con- 
trast, the C-N activation process is substantially 
disfavored both kinetically and thermodynam- 
ically. For the scenario of multicarbon homolo- 
gations, different reductive elimination transition 
states of 7a were carefully compared (Fig. 6D).
Unlike the reactions at the benzylic and terminal 
positions (TS-A and TS-C, respectively), reduc- 
tive elimination at internal positions (TS-B) ex- 
hibits much higher activation energy because of 
increased steric hindrance and lack of ben- 
zylic stabilization. Therefore, the overall barrier 
leading to undesired branched intermediates 
(24.0 kcal/mol) is substantially higher than 
that for linear products (19.3 kcal/mol; see 
fig. S3 for additional details), which explains 
the excellent linear selectivity observed in 
multicarbon homologations (Fig. 4). 
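
To put the computed numbers in perspective, the short sketch below converts the two energy gaps quoted above into ratios using the Boltzmann factor exp(ΔG/RT). It is a back-of-the-envelope estimate, not part of the reported DFT workflow. It assumes that the quoted barriers (24.0 and 19.3 kcal/mol) are free energies of activation for pathways competing under kinetic control at the 50°C used for Stage II in Fig. 4, and that the 4.0 kcal/mol gap is a free energy of reaction evaluated at 60°C. The function name `boltzmann_ratio` is illustrative only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_ratio(delta_g_kcal: float, temp_celsius: float) -> float:
    """Return exp(delta_g / RT) for an energy gap given in kcal/mol."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(delta_g_kcal / (R_KCAL * temp_kelvin))


# Kinetic preference for the linear pathway: barrier gap of 24.0 - 19.3 kcal/mol,
# evaluated at 50 C (assumption: Stage II temperature from the Fig. 4 caption).
linear_vs_branched = boltzmann_ratio(24.0 - 19.3, 50.0)

# Thermodynamic preference for the linear amidine: 4.0 kcal/mol at 60 C.
equilibrium_constant = boltzmann_ratio(4.0, 60.0)

print(f"rate ratio (linear/branched): {linear_vs_branched:.0f}")
print(f"equilibrium constant (linear/branched): {equilibrium_constant:.0f}")
```

Under these assumptions both ratios fall in the range of hundreds to thousands, which is in line with the absence of detectable (<5%) branched products.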

The hook-and-slide strategy presented here 
can greatly simplify homolog preparations of 
complex molecules. The use of removable DGs 
for activating unstrained amide C-C bonds, the 
discovery of a well-defined highly active rhodium
catalyst, and the mechanistic insights
gained in this study could have broad implications
for skeletal editing more generally.


Pregnancy-responsive pools of adult neural stem
cells for transient neurogenesis in mothers 


Zayna Chaker†, Corina Segalada†, Jonas A. Kretz, Ilhan E. Acar, Ana C. Delgado, Valerie Crotet,
Andreas E. Moor, Fiona Doetsch*


Adult neural stem cells (NSCs) contribute to lifelong brain plasticity. In the adult mouse ventricular- 
subventricular zone, NSCs are heterogeneous and, depending on their location in the niche, give rise to 
different subtypes of olfactory bulb (OB) interneurons. Here, we show that multiple regionally distinct NSCs, 
including domains that are usually quiescent, are recruited on different gestation days during pregnancy. 
Synchronized activation of these adult NSC pools generates transient waves of short-lived OB interneurons, 
especially in layers with less neurogenesis under homeostasis. Using spatial transcriptomics, we identified 
molecular markers of pregnancy-associated interneurons and showed that some subsets are temporarily 
needed for own pup recognition. Thus, pregnancy triggers transient yet behaviorally relevant neurogenesis, 
highlighting the physiological relevance of adult stem cell heterogeneity. 


Stem cells in the adult mouse brain dynamically
integrate and respond to environmental
signals throughout life (1, 2).
Ventricular-subventricular zone (V-SVZ) 
neural stem cells (NSCs) residing along 
the lateral ventricles are radial glial fibrillary 
acidic protein (GFAP)-expressing cells and ex- 
ist in quiescent or activated states (2). NSCs in 
distinct spatial domains of the V-SVZ give rise 
to different subtypes of olfactory bulb (OB) 
interneurons (2). Once integrated, adult-born 
neurons tend to persist long term (3, 4). In 
addition to constitutive neurogenesis, region- 
ally distinct adult NSCs can be modulated by 
physiological states such as hunger and satiety 
(5). However, whether other physiological states 
dynamically control distinct pools of adult stem 
cells, and the functional relevance of
on-demand stem cell recruitment for adaptive
brain plasticity, remain to be fully elucidated.
Pregnancy induces major structural changes 
in multiple brain regions in preparation for 
motherhood and parental care (6). In mice, 
proliferation in the adult V-SVZ increases at 
gestation day (Gd) 7 and again at postpartum 
day (Ppd) 7 (7, 8), and newborn neuron dendritic
complexity and synaptic integration are
enhanced in mothers (9, 10). Perturbing adult
V-SVZ neurogenesis sometimes results in de- 
fects in maternal behavior depending on the 
timing of stem cell manipulation (7, 8, 10-12). 
Here, we investigated whether spatial and 
temporal control of distinct stem cell pools 
occurs in response to pregnancy to generate 
specific OB interneuron subtypes, which dif- 
ferentially affect olfactory behavior during 
motherhood. Different physiological states may 
therefore regulate regionally distinct stem cell
pools, revealing a functional map of adult NSC
heterogeneity in the V-SVZ. 


Recruitment of regionally distinct NSCs 
during pregnancy 


We first quantified NSC proliferation (GFAP+
Ki67+) on several different days of gestation
and at Ppd 7.5 (Fig. 1A). Pregnancy did not uni- 
formly enhance NSC proliferation in the V-SVZ. 
Instead, NSC subpopulations in domains that 
tend to be more quiescent under homeostasis, 
such as the ventromedial wall (Fig. 1, B to D), 
the roof (Fig. 1, B and F, and fig. S1, A and C), 
and the dorsomedial corner (fig. S1, A and 
B), were active. By contrast, only some stem 
cell pools residing in more proliferative V-SVZ 
domains, such as the ventrolateral wall (Fig. 1, 
B, C, and E) and dorsolateral wedge (Fig. 1, B 
and G, and fig. S1C), responded to pregnancy,
but not those in the dorsolateral and inter- 
mediate V-SVZ (fig. S1, A, D, and E). In ad- 
dition to the spatial pattern of recruitment, 
pregnancy-related domains displayed distinct 
temporal dynamics (Fig. 1H). Dividing NSCs in 
the roof and the ventral V-SVZ increased at 
Gd 4.5 and 7.5, respectively (Fig. 1, D to F). 
Conversely, in the dorsolateral wedge and in 
the dorsomedial corner, NSC proliferation in- 
creased on several gestation days (Fig. 1G and 
fig. S1B). All changes were transient and mostly
occurred during the first week of gestation,
except in the dorsolateral wedge (Fig. 1G). 

To visualize proliferation dynamics through- 
out the V-SVZ, we analyzed whole-mount 
preparations of the medial and lateral walls 
for MCM2. Density maps confirmed the tem- 
poral recruitment of regionally distinct domains 
on different gestation days (Fig. 1I and figs. 
S2F and S3F) and revealed increased prolifer- 
ation in more caudal regions in both the me- 
dial (fig. S2) and lateral (fig. S3) walls, which 
are less proliferative in virgins. Pseudopregnant
female mice, which have a similar neuroendocrine
response to early pregnancy, shared
only some dividing regions in the lateral wall
with pregnant females (fig. S4, A and B). In the
medial wall, increased proliferation was only
observed in pregnant females (fig. S4, C and 
D), suggesting that the medial V-SVZ, as well 
as some regions in the lateral wall, are selec- 
tively responsive to pregnancy. Thus, regionally 
distinct stem cells can be transiently recruited 
in response to different physiological states. 


Pregnancy-associated interneurons 
are transient 


Most adult-born neurons integrate into the
granule cell layer (GCL) of the main olfactory
bulb (MOB), with a bias toward the deep GCL
(13), the connectivity of which differs from the
superficial GCL (14). Some interneurons are
also added to the glomerular layer (GL) (1),
but very rarely to the mitral cell layer (MCL)
(15). Adult-born neurons are functionally
integrated 2 to 3 weeks after their generation
(3, 16). To investigate whether the recruitment
of spatially distinct NSCs during pregnancy
results in the addition of specific OB
interneuron subtypes in mothers, we pulsed
pregnant female mice once with a thymidine
analog on different gestation days (Gd 0.5,
2.5, 4.5, or 7.5) and analyzed their OBs 20 days
later, coinciding with the time of birth and
early perinatal care (Fig. 2, A and B). The
distribution of analog+ neurons within OB layers
differed depending on their day of birth (fig. S5,
A and B). The MCL was the only layer in which
an increase in new neurons (thymidine analog+
NeuN+) born on Gd 0.5 and 2.5 occurred (Fig. 2,
C to F and I). In the GCL, only newborn neurons
labeled at Gd 4.5 and 7.5 were increased in
mothers (Fig. 2, C to F, I, and fig. S5C), largely
due to incorporation into the superficial GCL
(fig. S5D). Therefore, pregnancy-related
neurogenesis increases in sublayers where fewer
new neurons integrate under homeostasis. In
the accessory olfactory bulb (AOB) (Fig. 2, B to
F), where adult-born neurons are implicated
in reproductive social behavior (17-20), only
GCL neurons generated on Gd 4.5 and 7.5
increased (fig. S5, E and F). Pregnancy-associated
interneurons were functionally integrated into
existing OB circuitry, as measured by c-fos
expression (fig. S6, A to D). In the MOB of
perinatal mothers [20 days post-analog injection
(dpi)], the increase in c-fos+ analog+ cells was
restricted to the superficial GCL and MCL (fig.
S6, B and C), underscoring the importance of
pregnancy-associated neurogenesis in these two
layers in early motherhood.
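
The fold changes reported in Fig. 2, C to F, compare labeled neuron counts in mothers with those in matched virgin controls. The sketch below shows one plain way such a ratio could be computed. The counts are invented placeholders, not data from this study, and both the function name `fold_change` and the choice of dividing group means are assumptions rather than the authors' documented procedure.

```python
from statistics import mean


def fold_change(mother_counts, virgin_counts):
    """Mean count in mothers divided by mean count in matched virgin controls."""
    control_mean = mean(virgin_counts)
    if control_mean == 0:
        raise ValueError("control mean is zero; fold change is undefined")
    return mean(mother_counts) / control_mean


# Hypothetical analog+ NeuN+ cell counts per section (placeholders only).
virgin_mcl = [4, 5, 3, 4]
mother_mcl = [8, 9, 7, 8]

fc = fold_change(mother_mcl, virgin_mcl)
print(f"MCL fold change vs. virgin: {fc:.2f}")
```

A value above 1 indicates more labeled neurons in mothers than in virgin controls; a value near 1 indicates no change.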

To investigate whether pregnancy-related 
interneurons are transient or long-lasting, we 
assessed their survival 10 days later (30 dpi), 
around periweaning, when pups are increas- 
ingly feeding on solid food and mothers are 
less engaged in maternal care (Fig. 2A). In all 
layers in which neurogenesis was enhanced 
in perinatal mothers, the number of newborn
neurons decreased between 20 and 30 dpi
(Fig. 2, G to I, and fig. S5G). Pregnancy-related
interneurons in the GCL had been completely
culled by weaning (Fig. 2G), but in the MCL
(Fig. 2H) and in the AOB (fig. S5G), a few
neurons born on Gd 7.5 persisted. These results
reveal that pregnancy triggers the addition of
transient neurons to distinct layers of the OB,
coinciding with birth and the early perinatal
period, which disappear by periweaning.

Fig. 1. Dynamic spatial and temporal
response of adult V-SVZ NSCs to
pregnancy. (A) Overview of the gestation
and postpartum periods in mothers. (B) Left:
Schema of a mouse brain showing location
of the V-SVZ along the lateral wall (green),
the medial wall (magenta), and roof of
the lateral ventricle (orange). Right: Schema
of a coronal section of the V-SVZ highlighting
domains analyzed for NSC proliferation.
(C) Representative images of dividing
NSCs (arrowheads) (GFAP, green
and Ki67, magenta) in the ventral V-SVZ.
(D to G) Quantification of dividing NSCs
in regionally distinct V-SVZ domains.
(H) Summary schema of temporal and
spatial recruitment of V-SVZ stem cell
domains during pregnancy. (I) Averaged
MCM2 density maps of the medial wall for
each time point (N = 3). Scale indicates
intensity. Scale bar, 20 μm. MW, medial
wall; LW, lateral wall; C, caudal; R, rostral;
D, dorsal; V, ventral.


Spatial transcriptomics reveals OB remodeling 
in motherhood 


To gain insight into the molecular correlates 
of OB remodeling during the perinatal [post- 
natal day 6 (P6)] and periweaning (P19) periods, 
we performed Visium spatial transcriptomics 
(Fig. 3A and fig. S7, A to F). Unbiased clustering
of the transcriptomic data resulted in 12
clusters, which predominantly corresponded 
to anatomical layers, and highlighted sublayer 
molecular differences such as the superficial 
and deep GCL in the MOB (Fig. 3B and fig. S7, 
C and D). Two AOB clusters were present
(GCL and MCL/EPL) that were distinct from
MOB, supporting functional differences between
these OB structures (21). We validated
the expression of markers for each cluster, 
some of which were highly specific (fig. S7, G 
and H), using the Allen Brain Atlas in situ data- 
base. These data can be explored using a web- 
based application. 

Comparison of the whole MOB and AOB 
transcriptomes of perinatal mothers to virgin 
mice (fig. S7A) revealed an up-regulation of 
Gene Ontology (GO) processes related to syn- 
aptic remodeling and the generation of neu- 
rons (fig. S8, A and B), which we validated by 
RNAscope of selected candidate genes (Elp3,
Egr1, Nr4a3, and Klf9) (fig. S8C). Most up-regulated
genes were maintained in mothers
through periweaning (table S2) and coded for 
ribosome biogenesis and translation, mitochon- 
drial function, and neuronal plasticity (Fig. 3C; 
fig. S8, D to F; and table S2). By contrast, genes
that were transiently up-regulated perinatally 
and subsequently down-regulated between the 
perinatal and weaning period (Fig. 3C, fig. 
S8D, and tables S1 and S2) were related to
neuronal development and migration, independently
confirming the transient nature of
pregnancy-associated neurogenesis, as well
as circadian rhythm and hormonal responses
(Fig. 3C and table S2). Pairwise comparison
of differentially expressed genes selectively 
up-regulated perinatally or at periweaning in 
each cluster (figs. S7A and S9A and table S3) 
revealed the spatial enrichment of specific 
biological processes in distinct OB layers (fig. 
S9, B to D, and table S4), with more extensive 
remodeling perinatally (fig. S9C) than at peri- 
weaning (fig. S9D). 

Two clusters (clusters 4 and 9) were en- 
riched in mothers compared with virgins (Fig. 
3D and fig. S10, A to C), which was confirmed
in a sensitivity analysis (see the materials and
methods, fig. S11, and table S5). Cluster 4
sequencing spots were distributed across the
outer layers of the OB (superficial GCL/MCL/
EPL/GL) (fig. S10A), whereas cluster 9 was
localized in both superficial and deep GCL
(fig. S10B). Cluster 4 and 9 enriched genes
contributed to GO processes related to
neurogenesis, gliogenesis, behavior, and blood
vessel remodeling (Fig. 4A, fig. S10D, and table
S4). These are therefore ideal clusters with
which to identify pregnancy-related interneuron
subtypes.

Fig. 2. Pregnancy-associated OB interneuron
subtypes are transient. (A) Schema of pulse-chase
experiment on different gestation
days and physiological phases corresponding
to 20 and 30 dpi. (B) Schema of different
layers in MOB and AOB. (C to F) Fold
change quantification of analog+
NeuN+ neurons generated on different
gestation days in distinct OB layers
compared with matched virgin controls.
(G and H) Quantification of newly
generated neurons born on different
gestational days at 20 and 30 dpi
in the GCL (G) and MCL (H).
(I) Representative images of GCL
and MCL analog+ NeuN+ cells (pulsed
at Gd 7.5) in virgin female mice perinatally
(20 dpi) and at periweaning (30 dpi).

Fig. 3. Spatial transcriptomic analysis of OB
remodeling in mothers. (A) 10x Visium spatial
transcriptomics of MOB and AOB in virgins
and mothers at perinatal care and periweaning.
Sequencing spots on sampled OB sections are
shown (right). (B) Color-coded schema of
anatomical layers in MOB and AOB (right)
matching colors in the uniform manifold
approximation and projection (UMAP) plot
(left). (C) Network analysis of up-regulated
genes in perinatal mothers generated using
STRING software (https://string-db.org/).
Orange circles indicate the two regulation
hubs maintained through periweaning, and
blue circle indicates gene sets transiently
up-regulated perinatally. See table S1 for list
of genes. (D) Left: UMAP plots from the
Shiny app (https://www.rstudio.com/products/
shiny/) highlighting mother-enriched clusters 4
(green) and 9 (purple) in virgin and perinatal
samples. Right: cluster proportions over
total number of sequencing spots.
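
Cluster enrichment of the kind shown in Fig. 3D rests on a simple quantity: the share of sequencing spots assigned to each cluster within a condition. The sketch below computes that share from a list of per-spot cluster labels. The labels are made-up placeholders and `cluster_proportions` is a hypothetical helper; the study's own analysis is described in its materials and methods and is not reproduced here.

```python
from collections import Counter


def cluster_proportions(spot_labels):
    """Map each cluster label to its fraction of all sequencing spots."""
    counts = Counter(spot_labels)
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


# Hypothetical per-spot cluster assignments (placeholders, not study data).
virgin_spots = [1, 1, 2, 3, 4, 2, 1, 3, 9, 2]
mother_spots = [1, 4, 4, 9, 2, 4, 9, 3, 1, 4]

virgin_prop = cluster_proportions(virgin_spots)
mother_prop = cluster_proportions(mother_spots)

# Compare the two clusters of interest across conditions.
for cluster in (4, 9):
    print(cluster, virgin_prop.get(cluster, 0.0), mother_prop.get(cluster, 0.0))
```

Normalizing by the total number of spots per condition keeps the comparison meaningful when sections differ in size or in the number of spots captured.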


Molecular markers of pregnancy-associated 
transient OB interneurons 


To unravel pregnancy-associated interneuron 
markers, we focused on mother-enriched cluster
4, because pregnancy-related neurogenesis
primarily occurs in the superficial GCL and
MCL (figs. S5D and S6, B to D). Both are layers
in which adult-generated neurons integrate less
under homeostasis and for which few molecular
markers are known (15). Enriched genes in
cluster 4 included several known interneuron
markers, including NPY, Cbln1, Calb2, Zic1, and
TH (Fig. 4A), the expression of which is detected
in more sequencing spots in perinatal mothers
(Fig. 4B). Analog pulse labeling at Gd 4.5 and 7.5
revealed two distinct subpopulations of superficial
granule cells that increased during the
perinatal period [neuropeptide Y (NPY)+ analog+
and calretinin (CalR)+ analog+], whereas Zic1+
analog+ neurons did not change (Fig. 4, C and E,
and fig. S10, E and F). In the MCL, we identified
two new adult-generated interneuron subtypes
that increased perinatally, NPY+ analog+ cells
in the inner portion and cerebellin1+ (Cbln1+)
analog+ cells in the outer portion (Fig. 4, D
and E, and fig. S10G). By contrast, analog+ cells
co-stained with Zic1, CalR, or 5T4 did not
change (Fig. 4D and fig. S10H). The increase in
pregnancy-related OB interneurons was transient
in both the superficial GCL and MCL
(Fig. 4, C and D). Cluster 4 also includes
sequencing spots located in the GL, where tyrosine
hydroxylase (TH)+, calbindin+, and CalR+ analog+
interneurons, but not NPY+ or Zic1+ interneurons,
increased perinatally (fig. S12, A to F). In contrast
to the superficial GCL and MCL, analog+
interneurons in the GL were maintained through
periweaning (fig. S12A).

Fig. 4. Molecular markers
of pregnancy-associated
interneurons. (A) GO
process analysis for genes
enriched in cluster 4 using
Metacore (https://portal.genego.com/).
Selected
candidate interneuron


Pregnancy-related neurogenesis modulates 
own pup odor recognition 


To investigate the physiological relevance of 
transient pregnancy-associated interneurons, 
we assessed their survival dynamics when 
the perinatal period was prematurely short- 
ened by pup removal at P1 (Fig. 5A) or was 
extended by cross-fostering until P19 (fig. 
S13A). Superficial NPY+ and CalR+ GCL and
NPY+ and Cbln1+ MCL interneurons were all
lost at P6 in mothers whose pups were removed
at P1 (Fig. 5, B and C). Upon cross-fostering
with newborn alien pups every 6 days
until the natural time of periweaning, superficial
GCL interneurons were maintained (fig.
S13, B, D, and E). However, NPY+ and Cbln1+
MCL interneurons were not (fig. S13, C and D),
highlighting the functional heterogeneity of
pregnancy-related interneurons. To investigate
whether the maintenance of neurons requires
the physical presence of pups, we exposed
mothers whose pups were removed at P1 to 
alien pup nest odor for 6 days (Fig. 5A). This 
selectively rescued superficial GCL interneu- 
rons (Fig. 5B and fig. S13, D and E), but not 
MCL interneurons (Fig. 5C), in mothers and 
had no effect on virgin mice (Fig. 5, A and B). 
Superficial GCL interneurons were not res- 
cued by virgin female nest odor (Fig. 5, A and B), 
demonstrating that their survival is dependent 
on pup-related olfactory cues. By contrast, 
MCL interneurons were lost after own pup 
removal and were not rescued by either cross- 
fostering or exposure to alien pup odor, sug- 
gesting that their survival may depend on own 
pup odor. 

The dynamic spatial and temporal recruit- 
ment of distinct stem cell domains in the 
V-SVZ makes it challenging to selectively ab- 
late pregnancy-associated neurogenesis. There- 
fore, to determine the functional relevance of 
transient neurogenesis during pregnancy, we 
performed olfactory behavior assays using peri- 
weaning mothers and pup removal as models 
of natural and premature neuronal loss of super- 
ficial GCL and MCL interneurons, respectively, 
and pup odor-rescued mice in which only super- 
ficial GCL interneurons were maintained. 

To assess whether pregnancy-related MCL 
interneurons play a role in own pup odor 
discrimination, we performed habituation- 
dishabituation olfactory tests using virgin fe- 
male odor for learning trials and sequential 
exposure to alien and own pup odor for dis- 
crimination trials (Fig. 5D). Virgins did not dis- 
criminate alien pup odor, but both perinatal 
and periweaning mothers did (Fig. 5D). Perinatal
mothers could also discriminate own
pup from alien odor, but this ability was lost at
periweaning (Fig. 5D). Both pup removal and
pup odor-rescued mothers showed altered own 
and alien pup odor discrimination (Fig. 5D) 
compared with perinatal mothers, indicating 
that MCL interneurons play an important role 
in own pup odor detection, and that main- 
tenance of superficial GCL neurons alone is 
not sufficient to rescue this. Similar results 
were obtained when alien and own pup odors 
were used for habituation-dishabituation (fig. 
S13F). All mice showed normal pure odor 
discrimination when exposed to the synthet- 
ic odor 1-octanol (Fig. 5E), suggesting that 
pregnancy-related neurogenesis is involved 
in pup odor processing rather than global ol- 
factory discrimination. 

In a direct comparison of own versus alien 
P6 pups (fig. S13G), perinatal mothers pre- 
ferred their own pups compared with mothers 
after pup removal (fig. S13H). In individual 
mice, the number of analog+ MCL interneurons
correlated with own pup visits (fig. S13I). By
contrast, the number of analog+ superficial GCL
interneurons did not (fig. S13J). Therefore, the 
amount of neurogenesis in the MCL is direct- 
ly linked to own pup discrimination in mothers. 
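
The per-animal relationship described above (labeled MCL interneurons versus own pup visits) is a correlation across individual mice. As an illustration only, the sketch below computes a Pearson coefficient by hand. The paired values are invented placeholders, and the text does not state which correlation statistic the authors used, so the choice of Pearson's r is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-mouse values (placeholders, not study data).
mcl_neurons = [2, 4, 5, 7, 9]
own_pup_visits = [3, 5, 6, 9, 10]

r = pearson_r(mcl_neurons, own_pup_visits)
print(f"Pearson r = {r:.3f}")
```

Running the same function on superficial GCL counts would give the comparison described in the text, where no such relationship was found.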

To investigate the role of pregnancy-related 
superficial GCL interneurons independent- 
ly of those in the MCL, we used pup odor- 
rescued mothers and presented them with 
P1 alien pups and an inanimate object (fig. 
S13K). The pup exploration index was de- 
creased in mothers after pup removal and 
largely recovered in pup odor-rescued mothers, 
in which superficial GCL interneurons were 
preserved (fig. S13L). Thus, these neurons en- 
hance preference of pups over other objects 
and likely modulate stimulus selectivity in 
perinatal mothers.

Fig. 5. Physiological relevance of pregnancy-associated transient neurogenesis.
(A) Pup removal and odor exposure experimental design. (B and C) Quantification 
of effect of pup removal and social odor exposure on Gd 4.5- or Gd 7.5-born 
pregnancy-associated interneuron subtypes in the superficial GCL (B) and MCL (C). 
(D and E) Olfactory habituation-dishabituation tests to assess discrimination of 


Our findings show that pregnancy-related 
neurogenesis in the superficial GCL and MCL af- 
fects pup odor detection, and that MCL interneu- 
rons specifically are key for own pup recognition. 


Discussion 


Here, we show that the dynamic recruitment of 
regionally distinct NSCs during early pregnancy 
for the generation of diverse OB interneurons 
prepares the maternal brain in anticipation of 
upcoming physiological needs. 

The temporal regulation of spatially distinct 
adult NSCs in different physiological states is 
key to understanding the functional relevance 
of adult neurogenesis. Pregnancy-related inter- 
neurons are transient, integrating into layers 
in which fewer neurons are added under ho- 
meostasis, and are important contributors to 
on-demand OB plasticity. They likely have dif- 
ferent intrinsic properties, including higher ex- 
citability, than fully mature interneurons (22). 
This characteristic of pregnancy-related inter- 
neurons, together with their ephemeral nature, 
constitutes a cellular mechanism for adaptive 
plasticity that is not provided by resident neu- 
rons in the OB. Pregnancy-related interneurons 
may modulate the activity of mitral cells in 
mothers, making them more sensitive to social 
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than pure odors (23). Moreover, we show here 
that own versus alien pup odor discrimination 
is mediated by transient pregnancy-related in- 
terneurons in the MCL, providing a mecha- 
nism for familiar offspring odor recognition 
after parturition described in other mammals 
such as sheep (24). The number of newly gen- 
erated neurons in the MCL correlates with 
own pup preference, raising the exciting pos- 
sibility that amounts of neurogenesis in indi- 
viduals can be linked to specific aspects of 
parental care, perhaps also in males. 

The general principle of transient neuro- 
genesis at different time scales in preparation 
for physiological needs may be conserved across 
evolution. In songbirds and chickadees, sea- 
sonal neurogenesis is linked to seasonal song 
learning and food caching (25, 26). Our find- 
ings reveal shorter time scales over which spe- 
cific neuronal subtypes are added to OB circuitry 
for early-life bonding of mothers to their own 
pup. The culling of these neurons may facil- 
itate the physiological separation from their 
offspring. In humans, OB neurogenesis largely 
ends in infancy (27). Our findings suggest 
that quiescent stem cells may become acti- 
vated during human pregnancy for transient 


neurogenesis and underlie the temporary but 
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pup odor [(D), “(A),” alien, and then “(O),” own pup odors] and a pure odor [(E), 
l-octanol]. Asterisks indicate significant changes between two data points (P < 0.05) 
after ANOVA and post hoc testing with ANOVA. N = 7 virgins, N = 8 perinatal 
mothers, N = 4 to 6 periweaning mothers, N = 6 to 8 mothers after pup removal, 
and N = 6 mothers after pup removal and alien pup odor exposure. 


substantial changes in the sense of smell ex- 
perienced by some mothers. 

The V-SVZ domain of origin of some inter- 
neurons matches stem cell domains that we 
see recruited during pregnancy (15, 28, 29); 
however, those generating the pregnancy-related 
superficial GCL, MCL, and AOB interneurons 
remain unknown (fig. S13M). Moreover, some
recruited stem cell domains are gliogenic (30), 
raising the possibility that the V-SVZ may also 
contribute to adaptive oligodendrogenesis and 
myelination in mothers (37). Future molecular 
definition of regional stem cell domains will en- 
able more targeted manipulation of NSC pools 
and neuronal subtypes than the pup removal 
paradigm that we used here. Our findings further 
suggest that yet-to-be-identified OB interneu- 
ron subtypes are generated in other physiological 
contexts. It will be important to understand 
how different physiological states coordinately 
regulate the generation versus survival of dif- 
ferent types of neurons and glia. 

Globally, physiological states themselves, 
including pregnancy, and hunger and satiety 
(5), regulate spatially distinct NSC pools for 
on-demand neurogenesis and gliogenesis, de- 
coding the functional relevance of adult NSC 
heterogeneity.


Finding the right accommodations


Two months into my Ph.D. program, I attended a focus group to discuss health and disability in
graduate school. I asked a professor what accommodations were available to students taking 
their qualifying exams, the oral tests that are a crucial hurdle in many graduate programs. “I 
don’t know if we offer any,” they responded. I was startled. As someone with post-traumatic 
stress disorder (PTSD), I had always received accommodations when taking exams. The quali- 
fying exam wasn’t something I could fail. I soon realized if I wanted to succeed in graduate 
school as a student with a disability, I would need to speak up for the support I needed. 


When I was a teenager, I was diag- 
nosed with PTSD, the result of trau- 
matic events during my childhood. A 
hallmark of the condition is memory 
problems, so when I was a senior in 
high school, a teacher suggested I
might be eligible to receive accommo-
dations when taking exams. After an
evaluation by my school’s adminis- 
tration, I was granted extra time 
and a quiet space. The extra time 
allowed me to think through my 
answers, and the space freed me to 
cry if I needed to, which sometimes 
happened when I got overwhelmed 
by anxiety during exams. 

At first, I worried I was somehow 
defective for needing accommoda- 
tions. But over time I learned to 
reframe the situation: The exam mod- 
ifications didn’t mean I was weaker 
or lesser than other students; they simply removed barriers 
that shouldn’t have been there in the first place. 

During my undergraduate studies, I continued to receive 
similar accommodations. When I started graduate school, the 
focus group helped me realize I needed to think about what 
would help me get through my new program because the 
work would be different. Some of my studies involved course- 
work and written exams, but the bulk of my time would be 
spent doing research. During my second year, I would also 
need to pass the 3-hour oral qualifying exam, administered 
by a committee of faculty members. 

Choosing the right thesis adviser would be crucial. My 
program offered rotations in my first year, so I was able to 
experience different mentorship styles before making a deci- 
sion. I didn’t shy away from disclosing my disability. During 
each rotation I told the professor how PTSD impacts my life 
and asked them, “If I continue on in your lab, are you able to 
support me?” 

That helped me select a caring adviser who was willing 
to help. With his support, I met with one of my school’s
graduate disability specialists to 
talk about how best to approach the 
qualifying exam. 

Because I was concerned about re- 
calling facts, the specialist and I dis- 
cussed ways I could jog my memory 
without bringing in materials that 
would give me an unfair advantage. 
She proposed that I take in a bare- 
bones outline of the talk I would be 
giving about my research. She also 
suggested I be given time to write 
down the examiners’ questions and 
the option of asking for the questions 
to be repeated. Once approved by the 
exam chair, her list of accommoda- 
tions was sent to my committee. 

When exam day finally came, tears 
began to well up in my eyes from anx- 
iety. I decided to warn the commit- 
tee that I might get a bit emotional. 
Thankfully, they were supportive. And after giving myself a 
moment to calm down, I did my best to focus on what they 
were asking me. 

I was unable to answer a few questions because of mem- 
ory problems. One question was particularly frustrating 
because it was on a topic I had taught to undergraduates 
earlier that semester, and I knew the answer. Still, I passed, 
and was able to breathe a sigh of relief that I could continue 
with my program. 

Now, I’m on a mission to let fellow students know that ac- 
commodations for disabilities during graduate school do ex- 
ist. Every student with a disability is unique, and all deserve 
to get the help they need. I wrote an essay for my depart- 
ment’s graduate school handbook about how to start the pro- 
cess with our school’s disability center. I also plan to keep 
advocating for my own needs as I progress through my pro- 
gram. Learning to stand up for myself has been an impor- 
tant part of my academic journey. 


Soren Lipman is a Ph.D. student at the University of California, Berkeley.